Reputation: 1
I have a string like the one mentioned below.
{
behavior_capture_widget: "<div id="ndwc"><noscript><input type="hidden" id="ndpd-spbd" name="ndpd-spbd" value="ndpds~~~2.4177.112540pPaGJYVmt5WCtndGlUiUcRt3aSOPQ,,"></noscript></div> <script type="text/javascript"></script>"
customer_session_data: "2.4177.112540.1399312572.2.mFDzrW_JJeu-C_H45O5ADQ"
customer_cookie_data: "2.4177.112540.1399312572.2.XYjAsjFsOVHFXBGNnnHc-g,,."
}
I will always get the string in this format. Values may vary.
I have to extract the values of
behavior_capture_widget
customer_session_data
customer_cookie_data
into the variables a, b and c.
I'm new to C# and I tried a combination of Substring() and IndexOf(), but to no avail.
Help would be appreciated.
Thanks in advance.
Upvotes: 0
Views: 89
Reputation: 29926
If you do not know the names of the keys in the key: value pairs before execution, or you do not know the order of the expected keys, you could use a regular expression and a dictionary to store any key-value pair:
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class StrangeParser
{
public static readonly Regex LINE_REGEX = new Regex("^\\s*([a-zA-Z0-9_]+)\\s*\\:\\s*\"(.*)\"\\s*$", RegexOptions.Multiline);
public static Dictionary<string, string> ParseStr(string str)
{
var m = LINE_REGEX.Matches(str);
var res = new Dictionary<string, string>();
foreach (Match item in m)
{
res.Add(item.Groups[1].Value, item.Groups[2].Value);
}
return res;
}
static void Main(string[] args)
{
foreach (var item in ParseStr(@"
{
behavior_capture_widget: ""<div id=""ndwc""><noscript><input type=""hidden"" id=""ndpd-spbd"" name=""ndpd-spbd"" value=""ndpds~~~2.4177.112540pPaGJYVmt5WCtndGlUiUcRt3aSOPQ,,""></noscript></div> <script type=""text/javascript""></script>""
customer_session_data: ""2.4177.112540.1399312572.2.mFDzrW_JJeu-C_H45O5ADQ""
customer_cookie_data: ""2.4177.112540.1399312572.2.XYjAsjFsOVHFXBGNnnHc-g,,.""
}
"))
{
Console.WriteLine(item.Key + " " + item.Value);
}
}
}
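To get from the dictionary to the three variables a, b and c the question asks for, something like the following sketch could work. It reuses the same regular expression; the class name ExtractDemo, the method name Parse and the shortened sample input are only assumptions for illustration, and TryGetValue is used so that a missing key yields null instead of an exception.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ExtractDemo
{
    // Same pattern as above: key, colon, then a quoted value running to the end of the line.
    static readonly Regex LineRegex = new Regex(
        "^\\s*([a-zA-Z0-9_]+)\\s*:\\s*\"(.*)\"\\s*$", RegexOptions.Multiline);

    public static Dictionary<string, string> Parse(string input)
    {
        var result = new Dictionary<string, string>();
        foreach (Match m in LineRegex.Matches(input))
        {
            // The indexer overwrites on a duplicate key instead of throwing like Add does.
            result[m.Groups[1].Value] = m.Groups[2].Value;
        }
        return result;
    }

    static void Main()
    {
        // Shortened, hypothetical sample in the question's format.
        string input = "{\nbehavior_capture_widget: \"<div id=\"ndwc\"></div>\"\n" +
                       "customer_session_data: \"2.4177.112540\"\n" +
                       "customer_cookie_data: \"2.4177.112540,,.\"\n}";
        var values = Parse(input);

        // TryGetValue leaves the variable null if the key is missing.
        values.TryGetValue("behavior_capture_widget", out string a);
        values.TryGetValue("customer_session_data", out string b);
        values.TryGetValue("customer_cookie_data", out string c);

        Console.WriteLine(a);
        Console.WriteLine(b);
        Console.WriteLine(c);
    }
}
```

Because the value group is greedy and anchored to the end of the line, quotes inside the HTML value are kept and only the outermost pair is dropped.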
Upvotes: 1
Reputation: 68710
If you're going to be working with this format extensively, it may be worth it to design a parser for this syntax.
This can be done using parser combinators to define a syntax tree.
Using Sprache, I built this parser that gives you a Dictionary<string, string>:
var key = Parse.CharExcept(c => char.IsWhiteSpace(c) || c == ':', "").Many().Text();
var value =
from leading in Parse.Char('"')
from val in Parse.AnyChar.Until(Parse.Char('"').Then(_ => Parse.LineEnd)).Text()
select val;
var separator =
from x in Parse.WhiteSpace.Many()
from colon in Parse.Char(':')
from y in Parse.WhiteSpace.Many()
select colon;
var keyValue =
from k in key
from s in separator
from v in value
select new KeyValuePair<string, string>(k, v);
var parser =
from open in Parse.Char('{').Then(_ => Parse.WhiteSpace.Many())
from kvs in keyValue.DelimitedBy(Parse.WhiteSpace.Many())
from close in Parse.WhiteSpace.Many().Then(_ => Parse.Char('}'))
select kvs.ToDictionary(kv => kv.Key, kv => kv.Value);
You can now use this parser like this:
var dictionary = parser.Parse(input);
Upvotes: 0
Reputation: 3571
If the data is always valid JSON, I believe it is better to use Json.NET. Note that the sample as posted is not valid JSON (the inner quotes are not escaped and there are no commas between the members), so this only works if the real data is well formed, for example:
string json = @"{
""behavior_capture_widget"": ""<div id=\""ndwc\""><noscript><input type=\""hidden\"" id=\""ndpd-spbd\"" name=\""ndpd-spbd\"" value=\""ndpds~~~2.4177.112540pPaGJYVmt5WCtndGlUiUcRt3aSOPQ,,\""></noscript></div> <script type=\""text/javascript\""></script>"",
""customer_session_data"": ""2.4177.112540.1399312572.2.mFDzrW_JJeu-C_H45O5ADQ"",
""customer_cookie_data"": ""2.4177.112540.1399312572.2.XYjAsjFsOVHFXBGNnnHc-g,,.""
}";
Dictionary<string, string> dict = JsonConvert.DeserializeObject<Dictionary<string, string>>(json);
var mydata = dict["customer_cookie_data"]; // extraction
Upvotes: 0
Reputation: 119017
Instead of using substring, you could use string.Split to get you most of the way there:
var input = "... your data ...";
var valueNames = new []
{
"behavior_capture_widget:",
"customer_session_data:",
"customer_cookie_data:"
};
var items = input.Split(valueNames, StringSplitOptions.RemoveEmptyEntries);
Now you can extract your values:
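var behaviorCaptureWidget = items[1].Trim().Trim('"');
var customerSessionData = items[2].Trim().Trim('"');
var customerCookieData = items[3].Trim().TrimEnd('}').Trim().Trim('"');
Note that items[0] is the opening brace, Trim('"') strips the surrounding quotes, and the last value needs the trailing } to be removed manually. In the sample the } sits on its own line, so replacing the literal "} would not match.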
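For reference, here is a self-contained sketch of the Split approach. It assumes the three keys always appear in this order and that the closing brace sits on its own line, as in the question. The class name SplitDemo, the helper Clean and the shortened sample input are hypothetical.

```csharp
using System;

static class SplitDemo
{
    public static string[] Extract(string input)
    {
        var valueNames = new[]
        {
            "behavior_capture_widget:",
            "customer_session_data:",
            "customer_cookie_data:"
        };
        var items = input.Split(valueNames, StringSplitOptions.RemoveEmptyEntries);

        // items[0] is the text before the first key (the opening brace), so values start at 1.
        return new[] { Clean(items[1]), Clean(items[2]), Clean(items[3]) };
    }

    // Hypothetical helper: drops whitespace, a trailing '}' and the surrounding quotes.
    static string Clean(string raw)
    {
        return raw.Trim().TrimEnd('}').Trim().Trim('"');
    }

    static void Main()
    {
        // Shortened, hypothetical sample in the question's format.
        string input = "{\nbehavior_capture_widget: \"<div id=\"ndwc\"></div>\"\n" +
                       "customer_session_data: \"2.4177.112540\"\n" +
                       "customer_cookie_data: \"2.4177.112540,,.\"\n}";

        foreach (var value in Extract(input))
        {
            Console.WriteLine(value);
        }
    }
}
```

One caveat: Trim('"') removes every leading and trailing quote character, so a value that itself begins or ends with a quote would lose it.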
Upvotes: 1