
Splitting an XML-like string into tokens in C# or SQL

Source: Internet  Collected by: 自由互联  Published: 2021-06-13
I want to split an XML-like string into tokens in C# or SQL.
For example, given an input string like this:

<entry><AUTHOR>C. Qiao</AUTHOR> and <AUTHOR>R.Melhem</AUTHOR>, "<TITLE>Reducing Communication </TITLE>",<DATE>1995</DATE>. </entry>

I want this output:

C       AUTHOR
.       AUTHOR
Qiao    AUTHOR
and 
R       AUTHOR
.       AUTHOR
Melhem  AUTHOR
,   
"
Reducing        TITLE
Communication   TITLE
"
,
1995    DATE
.
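Here is a first attempt at solving this, under the following assumptions:
1. The XML string is valid (i.e. there are no stray characters between the tags), like this: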

string xml = @"<ENTRY><AUTHOR>C. Qiao</AUTHOR>
                                  <AUTHOR>R.Melhem</AUTHOR>
                                  <TITLE>Reducing Communication </TITLE>
                                  <DATE>1995</DATE>
                           </ENTRY>";

2. Splitting is done on the space character (" ").

using System;
using System.Xml.Linq;

string xml = @"<ENTRY><AUTHOR>C. Qiao</AUTHOR>
                      <AUTHOR>R.Melhem</AUTHOR>
                      <TITLE>Reducing Communication </TITLE>
                      <DATE>1995</DATE>
               </ENTRY>";
XElement doc = XElement.Parse(xml);
// Visit each child element (AUTHOR, TITLE, DATE) in document order
foreach (XElement element in doc.Elements())
{
    // Split the element's text on single spaces
    var values = element.Value.Split(' ');
    foreach (string value in values)
    {
        Console.WriteLine(element.Name + " " + value);
    }
}

This prints:

AUTHOR C.
AUTHOR Qiao
AUTHOR R.Melhem
TITLE Reducing
TITLE Communication
TITLE
DATE 1995

Edit:

To split on both "." and the space character, the best idea is to use a regular expression, like this:

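// Requires: using System.Linq; using System.Text.RegularExpressions;
// The capturing group (...) keeps the matched delimiters in the result array
var values = Regex.Split(element.Value, @"(\.| )");
foreach (string value in values.Where(x => !String.IsNullOrWhiteSpace(x)))
{
    Console.WriteLine(element.Name + " " + value);
}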

You can add more delimiters if you like. The example above produces the following:

AUTHOR C
AUTHOR .
AUTHOR Qiao
AUTHOR R
AUTHOR .
AUTHOR Melhem
TITLE Reducing
TITLE Communication
DATE 1995
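
The "." tokens survive in the output above because the delimiters sit inside a capturing group; `Regex.Split` then returns the captured delimiter text as array elements. The following is a minimal sketch of that difference. The class and method names (`SplitDemo`, `SplitKeepingDelimiters`, `SplitDroppingDelimiters`) are hypothetical and exist only for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Capturing group: the matched delimiters are returned as array elements too.
    public static string[] SplitKeepingDelimiters(string input)
    {
        return Regex.Split(input, @"(\.| )")
                    .Where(s => !string.IsNullOrWhiteSpace(s))
                    .ToArray();
    }

    // Non-capturing group (?:...): the delimiters are discarded.
    public static string[] SplitDroppingDelimiters(string input)
    {
        return Regex.Split(input, @"(?:\.| )")
                    .Where(s => !string.IsNullOrWhiteSpace(s))
                    .ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join("|", SplitKeepingDelimiters("C. Qiao")));   // C|.|Qiao
        Console.WriteLine(string.Join("|", SplitDroppingDelimiters("C. Qiao")));  // C|Qiao
    }
}
```

The whitespace filter is what removes the captured space and the empty string that appears between the adjacent "." and " " matches.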

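Edit 2:
Here is an example that works with the original string. It is most likely not the best approach, because it does not preserve the correct token order, but it should be very close: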

string xml = @" <entry>
                            <AUTHOR>C. Qiao</AUTHOR> 
                            and 
                            <AUTHOR>R.Melhem</AUTHOR>, 
                            ""<TITLE>Reducing Communication </TITLE>""
                           ,<DATE>1995</DATE>. 
                           </entry>";
            // Requires: using System; using System.Linq;
            //           using System.Text.RegularExpressions; using System.Xml.Linq;

            // Parse the XML into an XDocument
            XDocument doc = XDocument.Parse(xml);

            // Get the root element (we only have one)
            XElement element = doc.Root;

            // Create a copy of the element for use by the child elements
            XElement copyElement = new XElement(element);
            // Remove all child elements from the root, leaving only its own text
            element.Elements().Remove();

            // Split the remaining text on the specified delimiters
            var values = Regex.Split(element.Value, @"(\.| |\,|\"")");
            foreach (string value in values.Where(x => !String.IsNullOrWhiteSpace(x)))
            {
                Console.WriteLine(value);
            }

            // Get the child elements from the copy and split them the same way
            foreach (XElement elem in copyElement.Elements())
            {
                var val = Regex.Split(elem.Value, @"(\.| |\,|\"")");
                foreach (string value in val.Where(x => !String.IsNullOrWhiteSpace(x)))
                {
                    Console.WriteLine(value + " " + elem.Name);
                }
            }

            // You can try to play with DescendantsAndSelf
            // to see if you can do it in a single pass with the order preserved.
            //foreach (XElement elem in element.DescendantsAndSelf())
            //{
            //    //....
            //}

This prints the following:

and
,
"
"
,
.
C AUTHOR
. AUTHOR
Qiao AUTHOR
R AUTHOR
. AUTHOR
Melhem AUTHOR
Reducing TITLE
Communication TITLE
1995 DATE
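
The output above lists the untagged tokens first, so the original order is lost. One possible way to keep document order, offered here as a sketch and not as part of the original answer, is to walk the root's child nodes with `Nodes()`, which returns text nodes and elements interleaved in the order they appear. The class name `OrderedTokenizer`, the method `Tokenize`, and the token pattern `\w+|[^\w\s]` (a run of word characters, or any single non-space symbol) are assumptions chosen for this illustration; the sketch also assumes the tags are not nested more than one level deep.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class OrderedTokenizer
{
    // A token is a run of word characters, or one non-word, non-space character.
    private static readonly Regex TokenPattern = new Regex(@"\w+|[^\w\s]");

    public static List<string> Tokenize(string xml)
    {
        var lines = new List<string>();
        XElement root = XElement.Parse(xml);

        // Nodes() yields text nodes and child elements in document order.
        foreach (XNode node in root.Nodes())
        {
            if (node is XText text)
            {
                // Text outside any child tag: emit the bare token.
                foreach (Match m in TokenPattern.Matches(text.Value))
                    lines.Add(m.Value);
            }
            else if (node is XElement child)
            {
                // Text inside a child tag: emit the token followed by the tag name.
                foreach (Match m in TokenPattern.Matches(child.Value))
                    lines.Add(m.Value + "\t" + child.Name.LocalName);
            }
        }
        return lines;
    }

    public static void Main()
    {
        string xml = "<entry><AUTHOR>C. Qiao</AUTHOR> and <AUTHOR>R.Melhem</AUTHOR>, "
                   + "\"<TITLE>Reducing Communication </TITLE>\",<DATE>1995</DATE>. </entry>";
        foreach (string line in Tokenize(xml))
            Console.WriteLine(line);
    }
}
```

Run against the input string from the question, this emits the tokens in the order requested there: the three tokens of the first author, then "and", the second author, the comma and quote, the title words, and so on, with a tab between each token and its tag name.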