I want to split an XML-like string into tokens in C# or SQL.
For example, the input string looks like this:
<entry><AUTHOR>C. Qiao</AUTHOR> and <AUTHOR>R.Melhem</AUTHOR>, "<TITLE>Reducing Communication </TITLE>",<DATE>1995</DATE>. </entry>
I want this output:
C AUTHOR . AUTHOR Qiao AUTHOR and R AUTHOR . AUTHOR Melhem AUTHOR , " Reducing TITLE Communication TITLE " , 1995 DATE .

Here is a first attempt at solving this, under the following assumptions:
1. The XML string is valid (i.e., there are no invalid characters between the tags).
Like this:
string xml = @"<ENTRY><AUTHOR>C. Qiao</AUTHOR> <AUTHOR>R.Melhem</AUTHOR> <TITLE>Reducing Communication </TITLE> <DATE>1995</DATE> </ENTRY>";
2. Splitting is done on the space character (' ').
// Requires: using System; using System.Xml.Linq;
string xml = @"<ENTRY><AUTHOR>C. Qiao</AUTHOR> <AUTHOR>R.Melhem</AUTHOR> <TITLE>Reducing Communication </TITLE> <DATE>1995</DATE> </ENTRY>";
XElement doc = XElement.Parse(xml);

// Visit each child element and print every space-separated piece with its tag name
foreach (XElement element in doc.Elements())
{
    var values = element.Value.Split(' ');
    foreach (string value in values)
    {
        Console.WriteLine(element.Name + " " + value);
    }
}
This prints:
AUTHOR C.
AUTHOR Qiao
AUTHOR R.Melhem
TITLE Reducing
TITLE Communication
TITLE
DATE 1995
EDIT:
To split on both "." and spaces, the best idea is to use a regular expression. Like this:
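// Requires: using System.Linq; using System.Text.RegularExpressions;
// The capture group keeps the delimiters themselves in the result array
var values = Regex.Split(element.Value, @"(\.| )");
foreach (string value in values.Where(x => !String.IsNullOrWhiteSpace(x)))
{
    Console.WriteLine(element.Name + " " + value);
}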
You can add more delimiters if you like. With this change, the output is:
AUTHOR C
AUTHOR .
AUTHOR Qiao
AUTHOR R
AUTHOR .
AUTHOR Melhem
TITLE Reducing
TITLE Communication
DATE 1995
EDIT2:
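Here is an example that works with the original string. It is most likely not the best approach, because it does not preserve the correct token order, but it should be very close: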
// Requires: using System; using System.Linq; using System.Text.RegularExpressions; using System.Xml.Linq;
string xml = @" <entry> <AUTHOR>C. Qiao</AUTHOR> and <AUTHOR>R.Melhem</AUTHOR>, ""<TITLE>Reducing Communication </TITLE>"" ,<DATE>1995</DATE>. </entry>";

// Parse xml to XDocument
XDocument doc = XDocument.Parse(xml);

// Get first element (we only have one)
XElement element = doc.Descendants().FirstOrDefault();

// Create a copy of the element for use by child elements.
XElement copyElement = new XElement(element);

// Remove all child elements from the root, leaving only its own text
element.Elements().Remove();

// Split based on the delimiters specified (period, space, comma, double quote)
var values = Regex.Split(element.Value, @"(\.| |\,|\"")");
foreach (string value in values.Where(x => !String.IsNullOrWhiteSpace(x)))
{
    Console.WriteLine(value);
}

// Get the child elements from the copy and split them the same way
foreach (XElement elem in copyElement.Elements())
{
    var val = Regex.Split(elem.Value, @"(\.| |\,|\"")");
    foreach (string value in val.Where(x => !String.IsNullOrWhiteSpace(x)))
    {
        Console.WriteLine(value + " " + elem.Name);
    }
}

// You can try to play with DescendantsAndSelf
// to see if you can do it in a single pass and with order preserved.
//foreach (XElement elem in element.DescendantsAndSelf())
//{
//    //....
//}
This prints the following (the root element's own text first, then the tokens of the child elements):
and
,
"
"
,
.
C AUTHOR
. AUTHOR
Qiao AUTHOR
R AUTHOR
. AUTHOR
Melhem AUTHOR
Reducing TITLE
Communication TITLE
1995 DATE
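
One possible way to keep the tokens in document order is to walk the root's child nodes (text and elements interleaved) instead of handling the text and the elements in two separate passes. The following is only a sketch under some assumptions: the class name `OrderedTokenizer` and its `Tokenize` method are hypothetical helpers introduced for illustration, the tagged elements are assumed to be direct children of the root with no further nesting, and the delimiter set is the same one used above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical helper (not part of the original answer).
static class OrderedTokenizer
{
    // Same delimiters as above: period, space, comma, double quote.
    // The capture group makes Regex.Split return the delimiters too.
    private static readonly Regex Delimiters = new Regex(@"(\.| |,|"")");

    public static IEnumerable<string> Tokenize(string xml)
    {
        XElement root = XElement.Parse(xml);

        // Nodes() yields text nodes and child elements in document order.
        foreach (XNode node in root.Nodes())
        {
            if (node is XText text)
            {
                // Text directly under the root: emit tokens without a tag name.
                foreach (string token in Split(text.Value))
                    yield return token;
            }
            else if (node is XElement child)
            {
                // Text inside a child element: append the element's name.
                // Assumes the child has no nested elements of its own.
                foreach (string token in Split(child.Value))
                    yield return token + " " + child.Name.LocalName;
            }
        }
    }

    // Split on the delimiters and drop empty or whitespace-only pieces.
    private static IEnumerable<string> Split(string s) =>
        Delimiters.Split(s).Where(t => !string.IsNullOrWhiteSpace(t));
}

static class Program
{
    static void Main()
    {
        string xml = @"<entry><AUTHOR>C. Qiao</AUTHOR> and <AUTHOR>R.Melhem</AUTHOR>, ""<TITLE>Reducing Communication </TITLE>"",<DATE>1995</DATE>. </entry>";

        // Join with spaces to get a single line like the one asked for in the question.
        Console.WriteLine(string.Join(" ", OrderedTokenizer.Tokenize(xml)));
    }
}
```

Because each text node and each element is processed where it occurs, the surrounding words and punctuation ("and", the commas, the quotes, the final period) stay between the tagged tokens rather than being printed up front. If elements can nest more deeply, the loop would need to recurse into `child.Nodes()` instead of reading `child.Value`.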