Reputation: 184
After going through multiple questions about alignment with format specifiers, I still can't figure out why the numerical data gets printed to stdout in a wavy fashion.
def create_data(soup_object,max_entry=None):
    max_=max_entry
    entry=dict()
    for a in range(1,int(max_)+1):
        entry[a]={'Key':a,
                  'Title':soup_object[a].div.text.strip(),
                  'Link':soup_object[a].div.a['href'],
                  'Seeds':soup_object[a](attrs={'align':'right'})[0].text.strip(),
                  'Leechers':soup_object[a](attrs={'align':'right'})[1].text.strip()}
        yield entry[a]

tpb_get_data=tuple(create_data(soup_object=tpb_soup.body.table.find_all("tr"),max_entry=5))
for data in tpb_get_data:
    print('{0} {1:<11} {2:<25} {3:<25} '.format(data['Key'], data['Title'], data['Seeds'],data['Leechers']))
I also tried f-strings with format specifiers, but the data still prints in the following way. Can someone please help me figure this out?
1 Salvation.S02E11.HDTV.x264-KILLERS 262                       19
2 Salvation.S02E13.WEB.x264-TBS[ettv] 229                       25
3 Salvation.S02E08.HDTV.x264-KILLERS 178                       21
4 Salvation.S02E01.HDTV.x264-KILLERS 144                       11
5 Salvation.S02E09.HDTV.x264-SVA[ettv] 129                       14
I have read most of the questions on this. I would like to know if there is a plain method that does not rely on a library like tabulate, which does an excellent job, because I also want to learn how to do this without any library.
Upvotes: 0
Views: 1349
Reputation: 735
Great answer by @Jongware. Here is the same idea wrapped in a reusable function:
def print_list_of_dicts_as_table(list_of_dicts, keys=None):
    # assuming all dicts have same keys
    first_entry = list_of_dicts[0]
    if keys is None:
        keys = first_entry.keys()
    keys = list(keys)  # fix the column order once
    num_keys = len(keys)

    # widest value in each column
    max_key_lens = [
        max(len(str(item[k])) for item in list_of_dicts) for k in keys
    ]
    # a header may be wider than every value in its column
    for k_idx, k in enumerate(keys):
        max_key_lens[k_idx] = max(max_key_lens[k_idx], len(str(k)))

    fmtstring = (' | '.join(['{{:{:d}}}'] * num_keys)).format(*max_key_lens)

    # header, separator, then the rows, all restricted to the selected keys
    print(fmtstring.format(*keys))
    print(fmtstring.format(*['-'*key_len for key_len in max_key_lens]))
    for entry in list_of_dicts:
        print(fmtstring.format(*(entry[k] for k in keys)))
Usage example:
a=[{'a':'asdd','b':'asd'},{'a':'a','b':'asdsd'},{'a':1,'b':232323}]
print_list_of_dicts_as_table(a)
Output:
a    | b
---- | ------
asdd | asd
a    | asdsd
   1 | 232323
Upvotes: 1
Reputation: 13717
As already mentioned, you calculated lengths of strings incorrectly.
Instead of hardcoding them, delegate this task to your program.
Here is a general approach:
from operator import itemgetter
from typing import (Any,
                    Dict,
                    Iterable,
                    Iterator,
                    List,
                    Sequence)


def max_length(objects: Iterable[Any]) -> int:
    """Returns maximum string length of a sequence of objects"""
    strings = map(str, objects)
    return max(map(len, strings))


def values_max_length(dicts: Sequence[Dict[str, Any]],
                      *,
                      key: str) -> int:
    """Returns maximum string length of dicts values for specific key"""
    return max_length(map(itemgetter(key), dicts))


def to_aligned_data(dicts: Sequence[Dict[str, Any]],
                    *,
                    keys: List[str],
                    sep: str = ' ') -> Iterator[str]:
    """Yields the rows of a sequence of dicts as a left aligned table"""
    lengths = (values_max_length(dicts, key=key)
               for key in keys)
    format_string = sep.join(map('{{:{}}}'.format, lengths))
    for row in map(itemgetter(*keys), dicts):
        yield format_string.format(*row)
Examples:
data = [{'Key': '1',
         'Title': 'Salvation.S02E11.HDTV.x264-KILLERS',
         'Seeds': '262',
         'Leechers': '19'},
        {'Key': '2',
         'Title': 'Salvation.S02E13.WEB.x264-TBS[ettv]',
         'Seeds': '229',
         'Leechers': '25'},
        {'Key': '3',
         'Title': 'Salvation.S02E08.HDTV.x264-KILLERS',
         'Seeds': '178',
         'Leechers': '21'},
        {'Key': '4',
         'Title': 'Salvation.S02E01.HDTV.x264-KILLERS',
         'Seeds': '144',
         'Leechers': '11'},
        {'Key': '5',
         'Title': 'Salvation.S02E09.HDTV.x264-SVA[ettv]',
         'Seeds': '129',
         'Leechers': '14'}]

keys = ['Key', 'Title', 'Seeds', 'Leechers']
print(*to_aligned_data(data, keys=keys),
      sep='\n')
# 1 Salvation.S02E11.HDTV.x264-KILLERS   262 19
# 2 Salvation.S02E13.WEB.x264-TBS[ettv]  229 25
# 3 Salvation.S02E08.HDTV.x264-KILLERS   178 21
# 4 Salvation.S02E01.HDTV.x264-KILLERS   144 11
# 5 Salvation.S02E09.HDTV.x264-SVA[ettv] 129 14

keys = ['Title', 'Leechers']
print(*to_aligned_data(data, keys=keys),
      sep='\n')
# Salvation.S02E11.HDTV.x264-KILLERS   19
# Salvation.S02E13.WEB.x264-TBS[ettv]  25
# Salvation.S02E08.HDTV.x264-KILLERS   21
# Salvation.S02E01.HDTV.x264-KILLERS   11
# Salvation.S02E09.HDTV.x264-SVA[ettv] 14

keys = ['Key', 'Title', 'Seeds', 'Leechers']
print(*to_aligned_data(data, keys=keys, sep=' ' * 5),
      sep='\n')
# 1     Salvation.S02E11.HDTV.x264-KILLERS       262     19
# 2     Salvation.S02E13.WEB.x264-TBS[ettv]      229     25
# 3     Salvation.S02E08.HDTV.x264-KILLERS       178     21
# 4     Salvation.S02E01.HDTV.x264-KILLERS       144     11
# 5     Salvation.S02E09.HDTV.x264-SVA[ettv]     129     14
See docs for more. There are examples with alignment as well.
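As for alignment: the character in front of the width picks the side, with < for left, > for right and ^ for centered. The following is only a sketch of how the numeric columns could be right-aligned; the rows are illustrative values (altered so that the numbers differ in length), and the names rows, widths and template are not part of the code above.

```python
# Illustrative rows: (title, seeds, leechers), with numbers of different lengths.
rows = [('Salvation.S02E11.HDTV.x264-KILLERS', '262', '19'),
        ('Salvation.S02E13.WEB.x264-TBS[ettv]', '1229', '5')]

# Column widths: the longest string found in each column.
widths = [max(len(value) for value in column) for column in zip(*rows)]

# '<' left-aligns the title, '>' right-aligns the two numeric columns.
template = '{{:<{}}}  {{:>{}}}  {{:>{}}}'.format(*widths)

lines = [template.format(*row) for row in rows]
print('\n'.join(lines))
```

Right alignment keeps the last digits of the numbers in one column, which usually reads better for counts.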
Upvotes: 2
Reputation: 22478
You get a misaligned result because you did not count the length of the titles correctly. You only reserved 11 characters, whereas the first title is already 34 characters long.
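The reason a too-small width has no visible effect is that the width in a format specifier is a minimum: longer values are not truncated, they just push everything after them to the right. A small sketch with two of the titles from the question (the names too_narrow and wide_enough are made up for this illustration):

```python
titles = ['Salvation.S02E11.HDTV.x264-KILLERS',    # 34 characters
          'Salvation.S02E13.WEB.x264-TBS[ettv]']   # 35 characters

# Width 11 is smaller than both titles, so no padding is added at all
# and the '|' marker ends up in a different column on each row.
too_narrow = ['{:<11}|'.format(title) for title in titles]

# Width 35 (the length of the longer title) pads both to the same length.
wide_enough = ['{:<35}|'.format(title) for title in titles]

for row in too_narrow + wide_enough:
    print(row)
```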
Easiest is to have your program count for you:
# str() so that the integer Key from the question can be measured as well
key_len,title_len,seed_len,leech_len = ( max(len(str(item[itemname])) for item in tpb_get_data) for itemname in ['Key','Title','Seeds','Leechers'] )
fmtstring = '{{:{:d}}} {{:{:d}}} {{:{:d}}} {{:{:d}}}'.format(key_len,title_len,seed_len,leech_len)
for data in tpb_get_data:
    print(fmtstring.format(data['Key'], data['Title'], data['Seeds'],data['Leechers']))
with the much better result
1 Salvation.S02E11.HDTV.x264-KILLERS   262 19
2 Salvation.S02E13.WEB.x264-TBS[ettv]  229 25
3 Salvation.S02E08.HDTV.x264-KILLERS   178 21
4 Salvation.S02E01.HDTV.x264-KILLERS   144 11
5 Salvation.S02E09.HDTV.x264-SVA[ettv] 129 14
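Since the question mentions f-strings: computed widths can be used there too, through a nested replacement field inside the format spec. This is only a sketch; the two rows below are sample data shaped like the question's dicts, and the Seeds column is right-aligned as an arbitrary choice.

```python
# Sample rows shaped like the question's data (Key is an int there).
tpb_get_data = [
    {'Key': 1, 'Title': 'Salvation.S02E11.HDTV.x264-KILLERS',
     'Seeds': '262', 'Leechers': '19'},
    {'Key': 2, 'Title': 'Salvation.S02E13.WEB.x264-TBS[ettv]',
     'Seeds': '229', 'Leechers': '25'},
]

title_len = max(len(item['Title']) for item in tpb_get_data)
seed_len = max(len(item['Seeds']) for item in tpb_get_data)

# {title_len} and {seed_len} inside the format spec are evaluated first,
# so the width is taken from the variable instead of being hardcoded.
lines = [
    f"{data['Key']} {data['Title']:<{title_len}} {data['Seeds']:>{seed_len}} {data['Leechers']}"
    for data in tpb_get_data
]
print('\n'.join(lines))
```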
(Additional only)
Here is a more generalized approach that uses a list of to-print key names and is able to generate all other required variables on the fly. It does not need hardcoding the names of the variables nor fixing their order; the order is taken from that list. Adjustments of the items to show all go in one place: that same list, get_items. The output separator can be changed in the fmtstring line, for example using a tab or more spaces between the items.
get_items = ['Key','Title','Leechers','Seeds']
# str() so that the integer Key from the question can be measured as well
lengths = ( max(len(str(item[itemname])) for item in tpb_get_data) for itemname in get_items )
fmtstring = ' '.join(['{{:{:d}}}' for i in range(len(get_items))]).format(*lengths)
for data in tpb_get_data:
    print(fmtstring.format(*[data[key] for key in get_items]))
It works as follows:
lengths yields the maximum length of the values for each key named in the get_items list.
The fmtstring line repeats the format instruction {{:{:d}}} once for each of these items and fills in the number. The inner {:d} receives the length, while the outer {{: and }} get translated by format into {: and }, so the end result will be {:number} for each length. These separate format strings are joined into a single longer format string.
In the loop, the values are printed in the order given by get_items. The list comprehension looks them up; the * notation forces the list to be 'written out' as separate values, instead of passing the entire list as one.
Thanks to @Georgy for suggesting to look for a less hardcoded variety.
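The brace escaping can be traced step by step. A short sketch, assuming the widths 1, 36, 3 and 2 for the four columns (the names one_field, row and line exist only for this illustration):

```python
lengths = [1, 36, 3, 2]  # assumed column widths

# Step 1: '{{' and '}}' are escapes for literal braces, so only the inner
# '{:d}' is a replacement field; it receives the width.
one_field = '{{:{:d}}}'.format(36)

# Step 2: repeat the instruction once per column and fill in every width.
fmtstring = ' '.join(['{{:{:d}}}'] * len(lengths)).format(*lengths)

# Step 3: the result is itself a format string, ready for the row values.
row = ['5', 'Salvation.S02E09.HDTV.x264-SVA[ettv]', '129', '14']
line = fmtstring.format(*row)

print(one_field, fmtstring, line, sep='\n')
```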
Upvotes: 4