Reputation: 2145
I have a file of approximately 12 MB with the following format:
[["1",-154],["2",-100],["3",-28],["4",-66],["5",-222],["6",-309],["7",-196],["8",-50],["9",-53],["10",-209],["11",-355],["12",-350],["13",-269],["14",-264],["15",-392],["16",-513],["17",-515],["18",-434],["19",-418],["20",-505],["21",-592],["22",-559],["23",-422],["24",-384],["25",-539],["26",-716],["27",-713],["28",-593],["29",-534],["30",-647],["31",-813],["32",-857],["33",-711],["34",-582],["35",-594],["36",-700],["37",-721],["38",-600],["39",-487],["40",-490],["41",-589],["42",-630],["43",-502],["44",-365],["45",-340],["46",-403],["47",-420],["48",-291],["49",-136],["50",-98],["51",-218],["52",-285],["53",-198],["54",-52],["55",-58],["56",-213],["57",-334],["58",-301],["59",-195],["60",-195],["61",-324],["62",-470],["63",-465],["64",-378],["65",-381],["66",-546],["67",-734],["68",-767],["69",-695],["70",-683],["71",-804],["72",-991],["73",-1050],["74",-937],["75",-850],["76",-912],["77",-1041],["78",-1065],["79",-972],["80",-931],["81",-1030],["82",-1186],["83",-1233],["84",-1113],["85",-992],["86",-1051],["87",-1206],["88",-1299],["89",-1218],["90",-1112],["91",-1150],["92",-1287],["93",-1345],["94",-1239],["95",-1140],["96",-1147],["97",-1276],["98",-1363],["99",-1312],["100",-1206],["101",-1184],["102",-1297],["103",-1378],["104",-1297],["105",-1141],["106",-1113],["107",-1219],["108",-1325],["109",-1284],["110",-1147],["111",-1103],["112",-1179],["113",-1300],["114",-1262],["115",-1141],
Using Mathematica, I'd like to clean it up by removing all the symbols and the numbers between quotes, leaving just the remaining values separated by spaces, in the following format:
-154 -100 -28 -66 -222 -309 -196 etc…
How could I do this? I am fairly new to Mathematica, and the tutorials on "How to clean a HTML file" or "How to clean a ZIP file" didn't clarify my question very much.
Upvotes: 3
Views: 147
Reputation: 24336
Here is another method that avoids ToExpression
(which could theoretically run code you did not intend to):
Import["data.txt", "Text"];
StringSplit[%, {"[[", "],[", "]]", ","}][[2 ;; ;; 2]];
StringJoin[Riffle[%, " "]]
Export["out.dat", %, "Text"]
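To see what each step does, here is a minimal sketch of the same pipeline on a short inline string instead of the file. The names sample, parts and values are just placeholders for illustration, and the three-pair string is a cut-down stand-in for the real data:

```mathematica
(* Hypothetical three-pair sample in the same format as the file *)
sample = "[[\"1\",-154],[\"2\",-100],[\"3\",-28]]";

(* Split on the brackets and commas; empty strings at the ends are dropped by default *)
parts = StringSplit[sample, {"[[", "],[", "]]", ","}];

(* Every second element is a value; the odd positions hold the quoted indices *)
values = parts[[2 ;; ;; 2]];

(* Join the values with single spaces *)
StringJoin[Riffle[values, " "]]
(* "-154 -100 -28" *)
```

Since nothing is ever passed to ToExpression, the file contents stay plain strings from start to finish.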
Upvotes: 2
Reputation: 12817
You can try importing it as a string, replacing [ with { and ] with }, evaluating it with ToExpression, then stripping out the first element of each pair with Last@Transpose.
data = Import["your_data.txt"];
Last@Transpose@
ToExpression[StringReplace[data, {"[" -> "{", "]" -> "}"}]]
Of course, there are much nicer ways of doing this. Slater's idea also works well. You'll find there are many ways to do this sort of thing in Mathematica.
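The code above leaves you with a list of numbers rather than the space-separated text the question asks for. One possible way to finish, sketched on a short inline string (sample, pairs and values are illustrative names, and Part with All is swapped in for Last@Transpose, which should give the same result here):

```mathematica
(* Hypothetical three-pair sample in the same format as the file *)
sample = "[[\"1\",-154],[\"2\",-100],[\"3\",-28]]";

(* Turn the square brackets into list braces and evaluate the result *)
pairs = ToExpression[StringReplace[sample, {"[" -> "{", "]" -> "}"}]];

(* Take the second element of every pair *)
values = pairs[[All, 2]];

(* Convert each number to a string and join with single spaces *)
StringJoin[Riffle[ToString /@ values, " "]]
(* "-154 -100 -28" *)
```

Keep in mind that ToExpression evaluates whatever is in the string, so this is only appropriate for files you trust.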
Upvotes: 3
Reputation: 21914
Mathematica does have regex support, as well as a general string manipulation package. Something along the lines of:
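string = "[[\"1\",-154],[\"2\",-100],[\"3\",-28],[\"4\",-66]]";
(* split into one chunk per pair *)
strings = StringSplit[string, "],["];
(* remove the quoted index with its comma, and any trailing brackets *)
StringReplace[strings, RegularExpression["^\\[*\"[0-9]+\",|\\]+$"] -> ""]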
You might need to play around with that a little, but that's the idea.
Upvotes: 3