Cy.

Reputation: 2145

Cleanup data on one long line in Mathematica

I have a file of approximately 12 MB, all on one line, with the following structure:

[["1",-154],["2",-100],["3",-28],["4",-66],["5",-222],["6",-309],["7",-196],["8",-50],["9",-53],["10",-209],["11",-355],["12",-350],["13",-269],["14",-264],["15",-392],["16",-513],["17",-515],["18",-434],["19",-418],["20",-505],["21",-592],["22",-559],["23",-422],["24",-384],["25",-539],["26",-716],["27",-713],["28",-593],["29",-534],["30",-647],["31",-813],["32",-857],["33",-711],["34",-582],["35",-594],["36",-700],["37",-721],["38",-600],["39",-487],["40",-490],["41",-589],["42",-630],["43",-502],["44",-365],["45",-340],["46",-403],["47",-420],["48",-291],["49",-136],["50",-98],["51",-218],["52",-285],["53",-198],["54",-52],["55",-58],["56",-213],["57",-334],["58",-301],["59",-195],["60",-195],["61",-324],["62",-470],["63",-465],["64",-378],["65",-381],["66",-546],["67",-734],["68",-767],["69",-695],["70",-683],["71",-804],["72",-991],["73",-1050],["74",-937],["75",-850],["76",-912],["77",-1041],["78",-1065],["79",-972],["80",-931],["81",-1030],["82",-1186],["83",-1233],["84",-1113],["85",-992],["86",-1051],["87",-1206],["88",-1299],["89",-1218],["90",-1112],["91",-1150],["92",-1287],["93",-1345],["94",-1239],["95",-1140],["96",-1147],["97",-1276],["98",-1363],["99",-1312],["100",-1206],["101",-1184],["102",-1297],["103",-1378],["104",-1297],["105",-1141],["106",-1113],["107",-1219],["108",-1325],["109",-1284],["110",-1147],["111",-1103],["112",-1179],["113",-1300],["114",-1262],["115",-1141],

Using Mathematica, I'd like to clean it up by removing all the symbols and the numbers in quotes, leaving only the remaining values separated by spaces, in the following format:

-154 -100 -28 -66 -222 -309 -196 etc…

How could I do this? I am fairly new to Mathematica, and the tutorials on "How to clean a HTML file" or "How to clean a ZIP file" didn't help much with my question.

Upvotes: 3

Views: 147

Answers (3)

Mr.Wizard

Reputation: 24336

Here is another method that avoids ToExpression (which could theoretically run code you did not intend to):

Import["data.txt", "Text"];

StringSplit[%, {"[[", "],[", "]]", ","}][[2 ;; ;; 2]];

StringJoin[Riffle[%, " "]]

Export["out.dat", %, "Text"]
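To see why taking every second element works, here is a minimal sketch of the same steps on a short inline sample instead of the file. The variable names (`text`, `parts`, `values`, `result`) are only illustrative, and the sample string is an assumption standing in for the contents of data.txt.

```mathematica
(* Short sample in the same shape as the file; stands in for Import["data.txt", "Text"] *)
text = "[[\"1\",-154],[\"2\",-100],[\"3\",-28],[\"4\",-66]]";

(* Splitting on the brackets and commas leaves alternating labels and values:
   quoted label, value, quoted label, value, and so on *)
parts = StringSplit[text, {"[[", "],[", "]]", ","}];

(* Every second element, starting from the second, is a value *)
values = parts[[2 ;; ;; 2]];

(* Join the values with single spaces *)
result = StringJoin[Riffle[values, " "]]
(* "-154 -100 -28 -66" *)
```

Named variables also avoid relying on %, which only refers to the immediately preceding output.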

Upvotes: 2

Mike Bailey

Reputation: 12817

You can try importing it as a string, replacing [ with { and ] with }, evaluating it with ToExpression, and then dropping the first element of each pair with Last@Transpose.

data = Import["your_data.txt"];
Last@Transpose@
  ToExpression[StringReplace[data, {"[" -> "{", "]" -> "}"}]]

Of course, there are nicer ways of doing this. Slater's idea works well too. You'll find there are many ways to do this sort of thing in Mathematica.
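The code above returns a list of integers rather than the space-separated text asked for in the question. A minimal sketch of that last step follows, using a short inline sample as an assumed stand-in for the imported file; the names `values` and `result` are only illustrative.

```mathematica
(* Short sample in the same shape as the file; stands in for Import["your_data.txt"] *)
data = "[[\"1\",-154],[\"2\",-100],[\"3\",-28],[\"4\",-66]]";

(* Turn the square brackets into braces, evaluate, and keep the second column *)
values = Last@Transpose@
   ToExpression[StringReplace[data, {"[" -> "{", "]" -> "}"}]];

(* Convert each integer to a string and join with single spaces *)
result = StringJoin[Riffle[ToString /@ values, " "]]
(* "-154 -100 -28 -66" *)
```

The resulting string can then be written out with Export["out.dat", result, "Text"], as in Mr.Wizard's answer.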

Upvotes: 3

Slater Victoroff

Reputation: 21914

Mathematica does have regex support, as well as a general string manipulation package. Something along the lines of:

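(* Quotes inside a Mathematica string must be escaped with a backslash *)
string = "[[\"1\",-154],[\"2\",-100],[\"3\",-28],[\"4\",-66]]";
strings = StringSplit[string, "],["];

(* Remove each quoted number and the comma after it; the outer brackets are still left over *)
StringReplace[strings, RegularExpression["\"[0-9]+\","] -> ""]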

You might need to play around with that a little, but that's the idea.
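One way to finish the regex idea, as a minimal sketch: instead of deleting what is not wanted, pull out the wanted values directly with StringCases. The pattern assumes every value is an integer that sits between a comma and a closing bracket, as in the sample from the question; the names `values` and `result` are only illustrative.

```mathematica
(* Short sample in the same shape as the file *)
string = "[[\"1\",-154],[\"2\",-100],[\"3\",-28],[\"4\",-66]]";

(* Capture an optional minus sign and digits between "," and "]";
   "$1" returns just the captured group *)
values = StringCases[string, RegularExpression[",(-?[0-9]+)\\]"] -> "$1"];

(* Join the captured values with single spaces *)
result = StringJoin[Riffle[values, " "]]
(* "-154 -100 -28 -66" *)
```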

Upvotes: 3
