Reputation: 3781
I have a file in the following format:
Line 1 {"name": "Hotel Eiffel Petit Louvre", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "3.870967741935484", "stars": "2.5", "max_price": "324", "min_price": "117", "ref": "208100", "review": "Within walking distance of the Eiffel Tower."}
Line 2 {"name": "Novotel Paris Centre Tour Eiffel", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "3.1739130434782608", "stars": "4", "max_price": "271", "min_price": "149", "ref": "233528", "review": "Close to Seine river and Eiffel Tower."}
Line 3 {"name": "Hotel Tourisme Avenue", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.703125", "stars": "3", "max_price": "285", "min_price": "130", "ref": "558849", "review": "Close to the Eiffel Tower and metro station literally right outside the door."}
Line 4 {"name": "Hotel du Champ de Mars", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.714285714285714", "stars": "3", "max_price": "255", "min_price": "189", "ref": "570544", "review": "Very close to everything including the Eiffel Tower."}
Line 5 {"name": "Le Derby Alma", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.707865168539326", "stars": "4", "max_price": "418", "min_price": "210", "ref": "240927", "review": "Only a couple of blocks from the Eiffel Tower."}
Line 6 {"name": "Hotel Eiffel Seine", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.237288135593221", "stars": "0", "max_price": "297", "min_price": "141", "ref": "572984", "review": "Driectly next to 2 amazing cafes and literally only a 4 minute walk to the Eiffel Tower."}
Line 7 {"name": "Hotel Galileo", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.5396825396825395", "stars": "3", "max_price": "599", "min_price": "90", "ref": "197576", "review": "Within walking distance to the Eiffel Tower and many other attractions."}
Line 8 {"name": "Hotel Eiffel Seine", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.237288135593221", "stars": "0", "max_price": "297", "min_price": "141", "ref": "572984", "review": "Only a few blocks from Eiffel tower and about a short block from river Seine."}
Line 9 {"name": "Hotel Relais Bosquet Paris", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.8", "stars": "3", "max_price": "332", "min_price": "145", "ref": "229602", "review": "Very close to the metro station, restaurants and the Eiffel Tower!"}
Line 10 {"name": "Hotel Le Marquis", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.673333333333333", "stars": "4", "max_price": "368", "min_price": "155", "ref": "290384", "review": "Near a metro station, a few blocks from the Eiffel tower, and a grocery store across the street."}
Line 11 {"name": "Hotel Relais Bosquet Paris", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.8", "stars": "3", "max_price": "332", "min_price": "145", "ref": "229602", "review": "Located a 10 minute walk to Eiffel Tower."}
Line 12 {"name": "Hotel Eiffel Petit Louvre", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "3.870967741935484", "stars": "2.5", "max_price": "324", "min_price": "117", "ref": "208100", "review": "Metro station is literally across the road."}
Line 13 {"name": "Novotel Paris Centre Tour Eiffel", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "3.1739130434782608", "stars": "4", "max_price": "271", "min_price": "149", "ref": "233528", "review": "Its about 1.5 kms from Eiffel Tower and about 3 kms from Champ de ellesse."}
Line 14 {"name": "Hotel Tourisme Avenue", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.703125", "stars": "3", "max_price": "285", "min_price": "130", "ref": "558849", "review": "It is conveniently located a few steps (literally) from the Metro, about a 7 mins walk from the Eiffel Tower, there is a supermarket across the street, a bakery two stores down, and many cafes and restaurants close by."}
Line 15 {"name": "Hotel du Champ de Mars", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.714285714285714", "stars": "3", "max_price": "255", "min_price": "189", "ref": "570544", "review": "Location is absolutely brilliant, only a few mins to Ecole Militaire metro and 15min walk to the Eiffel Tower."}
Line 16 {"name": "Le Derby Alma", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.707865168539326", "stars": "4", "max_price": "418", "min_price": "210", "ref": "240927", "review": "Very nice small hotel right by the Eiffel tower."}
Line 17 {"name": "Hotel Galileo", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.5396825396825395", "stars": "3", "max_price": "599", "min_price": "90", "ref": "197576", "review": "It’s a small hotel near Champs-Elysées!!!"}
Line 18 {"name": "Hotel Le Marquis", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.673333333333333", "stars": "4", "max_price": "368", "min_price": "155", "ref": "290384", "review": "Fantastic Boutique Hotel, Location only 5 mins walk to Eiffel Tower."}
For the sake of convenience, I have given an example of 18 lines, but my actual file has millions of lines. What would be the fastest way with the minimum latency to group the lines by "name" with the minimum order change, like the following?
Line 1 {"name": "Hotel Eiffel Petit Louvre", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "3.870967741935484", "stars": "2.5", "max_price": "324", "min_price": "117", "ref": "208100", "review": "Within walking distance of the Eiffel Tower."}
Line 12 {"name": "Hotel Eiffel Petit Louvre", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "3.870967741935484", "stars": "2.5", "max_price": "324", "min_price": "117", "ref": "208100", "review": "Metro station is literally across the road."}
Line 2 {"name": "Novotel Paris Centre Tour Eiffel", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "3.1739130434782608", "stars": "4", "max_price": "271", "min_price": "149", "ref": "233528", "review": "Close to Seine river and Eiffel Tower."}
Line 13 {"name": "Novotel Paris Centre Tour Eiffel", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "3.1739130434782608", "stars": "4", "max_price": "271", "min_price": "149", "ref": "233528", "review": "Its about 1.5 kms from Eiffel Tower and about 3 kms from Champ de ellesse."}
Line 3 {"name": "Hotel Tourisme Avenue", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.703125", "stars": "3", "max_price": "285", "min_price": "130", "ref": "558849", "review": "Close to the Eiffel Tower and metro station literally right outside the door."}
Line 14 {"name": "Hotel Tourisme Avenue", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.703125", "stars": "3", "max_price": "285", "min_price": "130", "ref": "558849", "review": "It is conveniently located a few steps (literally) from the Metro, about a 7 mins walk from the Eiffel Tower, there is a supermarket across the street, a bakery two stores down, and many cafes and restaurants close by."}
Line 4 {"name": "Hotel du Champ de Mars", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.714285714285714", "stars": "3", "max_price": "255", "min_price": "189", "ref": "570544", "review": "Very close to everything including the Eiffel Tower."}
Line 15 {"name": "Hotel du Champ de Mars", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.714285714285714", "stars": "3", "max_price": "255", "min_price": "189", "ref": "570544", "review": "Location is absolutely brilliant, only a few mins to Ecole Militaire metro and 15min walk to the Eiffel Tower."}
Line 5 {"name": "Le Derby Alma", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.707865168539326", "stars": "4", "max_price": "418", "min_price": "210", "ref": "240927", "review": "Only a couple of blocks from the Eiffel Tower."}
Line 16 {"name": "Le Derby Alma", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.707865168539326", "stars": "4", "max_price": "418", "min_price": "210", "ref": "240927", "review": "Very nice small hotel right by the Eiffel tower."}
Line 6 {"name": "Hotel Eiffel Seine", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.237288135593221", "stars": "0", "max_price": "297", "min_price": "141", "ref": "572984", "review": "Driectly next to 2 amazing cafes and literally only a 4 minute walk to the Eiffel Tower."}
Line 8 {"name": "Hotel Eiffel Seine", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.237288135593221", "stars": "0", "max_price": "297", "min_price": "141", "ref": "572984", "review": "Only a few blocks from Eiffel tower and about a short block from river Seine."}
Line 7 {"name": "Hotel Galileo", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.5396825396825395", "stars": "3", "max_price": "599", "min_price": "90", "ref": "197576", "review": "Within walking distance to the Eiffel Tower and many other attractions."}
Line 17 {"name": "Hotel Galileo", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.5396825396825395", "stars": "3", "max_price": "599", "min_price": "90", "ref": "197576", "review": "It’s a small hotel near Champs-Elysées!!!"}
Line 9 {"name": "Hotel Relais Bosquet Paris", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.8", "stars": "3", "max_price": "332", "min_price": "145", "ref": "229602", "review": "Very close to the metro station, restaurants and the Eiffel Tower!"}
Line 11 {"name": "Hotel Relais Bosquet Paris", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.8", "stars": "3", "max_price": "332", "min_price": "145", "ref": "229602", "review": "Located a 10 minute walk to Eiffel Tower."}
Line 10 {"name": "Hotel Le Marquis", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.673333333333333", "stars": "4", "max_price": "368", "min_price": "155", "ref": "290384", "review": "Near a metro station, a few blocks from the Eiffel tower, and a grocery store across the street."}
Line 18 {"name": "Hotel Le Marquis", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.673333333333333", "stars": "4", "max_price": "368", "min_price": "155", "ref": "290384", "review": "Fantastic Boutique Hotel, Location only 5 mins walk to Eiffel Tower."}
I heard that it is possible to do this with jq. If so, what would the command look like? If there are faster tools, I would love to know.
Note: The following must be the 3rd line!
Line 2 {"name": "Novotel Paris Centre Tour Eiffel", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "3.1739130434782608", "stars": "4", "max_price": "271", "min_price": "149", "ref": "233528", "review": "Close to Seine river and Eiffel Tower."}
Best,
Upvotes: 1
Views: 71
Reputation: 116900
What would be the fastest way with the minimum latency to group the lines by "name" with the minimum order change
In brief - use GROUP_BY/2, defined by:
def GROUP_BY(stream;f): reduce stream as $x ({}; .[$x|f] += [$x]);
In your case, you'd use this as follows:
GROUP_BY(inputs; .name)[][]
with invocation along the lines of: jq -cnf program.jq lines.json
(Notice: no slurping!)
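Since the question also asks about other tools, here is a minimal Python sketch of the same idea as GROUP_BY: bucket each line under its "name" as it is read, relying on the fact that Python dicts keep insertion order. The helper name `group_lines_by_name` and the tiny `sample` list are made up for illustration, and the sketch assumes every line is a bare JSON object containing the key. No timing claim is made here; whether it is faster than jq would need to be measured on the real file.

```python
import json


def group_lines_by_name(lines, key="name"):
    """Yield the input lines grouped by `key`, groups in first-appearance order."""
    groups = {}  # dicts preserve insertion order (Python 3.7+)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Store the original text so the output lines are byte-for-byte unchanged.
        groups.setdefault(json.loads(line)[key], []).append(line)
    for members in groups.values():
        yield from members


# Hypothetical miniature input standing in for the hotel file.
sample = [
    '{"name": "A", "review": "first"}',
    '{"name": "B", "review": "second"}',
    '{"name": "A", "review": "third"}',
]
grouped = list(group_lines_by_name(sample))
for line in grouped:
    print(line)
```

Like the jq program, this reads the input one line at a time without sorting, but it still has to hold every line in memory until the end of the input, because a later line may belong to an earlier group. To run it on a file, an open file handle can be passed in place of `sample`.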
"minimum order change" is accomplished because jq constructs objects incrementally, adding new keys after old ones.
"fastest way" is accomplished because this solution does not involve the sorting of the input.
"minimum latency" is accomplished because the input is not "slurped".
Upvotes: 1
Reputation: 36296
If the JSON content is always structured the same way (.name up front), it'd suffice to use sort from GNU coreutils:
sort file.json
{"name": "Hotel du Champ de Mars", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.714285714285714", "stars": "3", "max_price": "255", "min_price": "189", "ref": "570544", "review": "Location is absolutely brilliant, only a few mins to Ecole Militaire metro and 15min walk to the Eiffel Tower."}
{"name": "Hotel du Champ de Mars", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.714285714285714", "stars": "3", "max_price": "255", "min_price": "189", "ref": "570544", "review": "Very close to everything including the Eiffel Tower."}
{"name": "Hotel Eiffel Petit Louvre", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "3.870967741935484", "stars": "2.5", "max_price": "324", "min_price": "117", "ref": "208100", "review": "Metro station is literally across the road."}
{"name": "Hotel Eiffel Petit Louvre", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "3.870967741935484", "stars": "2.5", "max_price": "324", "min_price": "117", "ref": "208100", "review": "Within walking distance of the Eiffel Tower."}
{"name": "Hotel Eiffel Seine", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.237288135593221", "stars": "0", "max_price": "297", "min_price": "141", "ref": "572984", "review": "Driectly next to 2 amazing cafes and literally only a 4 minute walk to the Eiffel Tower."}
{"name": "Hotel Eiffel Seine", "detailed_city": "Europe | France | Ile-de-France | Paris", "review_rating": "4.237288135593221", "stars": "0", "max_price": "297", "min_price": "141", "ref": "572984", "review": "Only a few blocks from Eiffel tower and about a short block from river Seine."}
⋮
If not, you can --slurp (or -s) the JSON stream, sort_by the .name field, and use the --compact-output (or -c) format.
jq -sc 'sort_by(.name)[]' file.json
{"name":"Hotel Eiffel Petit Louvre","detailed_city":"Europe | France | Ile-de-France | Paris","review_rating":"3.870967741935484","stars":"2.5","max_price":"324","min_price":"117","ref":"208100","review":"Within walking distance of the Eiffel Tower."}
{"name":"Hotel Eiffel Petit Louvre","detailed_city":"Europe | France | Ile-de-France | Paris","review_rating":"3.870967741935484","stars":"2.5","max_price":"324","min_price":"117","ref":"208100","review":"Metro station is literally across the road."}
{"name":"Hotel Eiffel Seine","detailed_city":"Europe | France | Ile-de-France | Paris","review_rating":"4.237288135593221","stars":"0","max_price":"297","min_price":"141","ref":"572984","review":"Driectly next to 2 amazing cafes and literally only a 4 minute walk to the Eiffel Tower."}
{"name":"Hotel Eiffel Seine","detailed_city":"Europe | France | Ile-de-France | Paris","review_rating":"4.237288135593221","stars":"0","max_price":"297","min_price":"141","ref":"572984","review":"Only a few blocks from Eiffel tower and about a short block from river Seine."}
{"name":"Hotel Galileo","detailed_city":"Europe | France | Ile-de-France | Paris","review_rating":"4.5396825396825395","stars":"3","max_price":"599","min_price":"90","ref":"197576","review":"Within walking distance to the Eiffel Tower and many other attractions."}
{"name":"Hotel Galileo","detailed_city":"Europe | France | Ile-de-France | Paris","review_rating":"4.5396825396825395","stars":"3","max_price":"599","min_price":"90","ref":"197576","review":"It’s a small hotel near Champs-Elysées!!!"}
⋮
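For comparison, the slurp-and-sort approach can be sketched in Python as well. This is only an illustration under the same assumption that each line is a bare JSON object with a "name" key; `sort_lines_by_name` and `sample` are hypothetical names. Python's `sorted` is stable, so lines with the same name keep their original relative order, but the groups come out in alphabetical order (by code point) rather than in order of first appearance.

```python
import json


def sort_lines_by_name(lines, key="name"):
    """Return the lines sorted by `key`; ties keep their input order (stable sort)."""
    # Each line is parsed once to extract its sort key; the line text itself is kept.
    return sorted(lines, key=lambda line: json.loads(line)[key])


# Hypothetical miniature input standing in for the hotel file.
sample = [
    '{"name": "B", "review": "first"}',
    '{"name": "A", "review": "second"}',
    '{"name": "B", "review": "third"}',
]
ordered = sort_lines_by_name(sample)
for line in ordered:
    print(line)
```

As with slurping in jq, the whole input has to be in memory before the first line can be written out.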
Upvotes: 0