2016-06-10

Get data from the third table in HTML

I am working on a tool and am on the last steps, but I have run into a small problem and would appreciate a hint. The page has these three tables, and I can only get the data from the first two. How can I reach the third one, the one titled "Upgrade Warranty and Service Information"?

This is the HTML for the tables:

<body>
  <div id="ibm-pcon">
    <div id="ibm-content">
      <div id="ibm-leadspace-head" class="ibm-alternate">
        <div id="ibm-leadspace-body">
          <br></br>
          <script type="text/javascript">currentDate();</script>
          <br></br>
          <!--BEGIN OPTIONAL BREADCRUMBING--> <span style="font-size: small;"><a href="/pc/entitle/pg2/Service.wss/display/MachineHome">Machine Lookup</a> &gt; <a href="/pc/entitle/pg2/Service.wss/mts/Lookup">Warranty Information</a> &gt; </span>
          <!--END OPTIONAL BREADCRUMBING-->
          <br></br>
          <h1>PEW | Warranty Information</h1>
        </div>
      </div>
      <!-- CONTENT_BODY -->
      <div id="ibm-content-body">
        <div id="ibm-content-main">
          <!-- LEADSPACE_BEGIN -->
          <!-- This section can be used to test JavaScript and CSS before promoting the data to the template XML. -->
 
\t \t <table class="ibm-results-table" summary="output table" cellpadding="0" cellspacing="0" border="0"><tbody xmlns="http://www.w3.org/TR/xhtml1/"> 
 
<thead> 
 
<tr> 
 
<th scope="col" class="pg2OutputTableSectionTitle">Results of Machine Type/Serial Number Query</th> 
 
</tr> 
 
</thead> 
 
<tr> 
 
<td><table class="ibm-data-table ibm-alternating" summary="output table" cellpadding="0" cellspacing="0" border="0"><tbody> 
 
<thead> 
 
<tr> 
 
<th scope="col" colspan="3" class="pg2TableSectionTitle">General Machine Information:</th> 
 
</tr> 
 
</thead> 
 
<tr> 
 
<td> 
 
        Type: 
 
        <span>1746</span> 
 
</td><td> 
 
        Model: 
 
        <span>C4A</span> 
 
</td><td> 
 
        Serial: 
 
        <span>13D06MK</span> 
 
</td> 
 
</tr> 
 
<tr> 
 
<td> 
 
        Status: 
 
        <span>Proof Of Purchase Rcvd</span> 
 
</td><td> 
 
         Build Date: 
 
         <span>&nbsp;</span> 
 
</td><td> 
 
         Build to Model: 
 
         <span> </span> 
 
</td> 
 
</tr> 
 
<tr> 
 
<td> 
 
         Geography: 
 
         <span>EMEA</span> 
 
</td><td> 
 
         Country: 
 
         <span>GREECE</span> 
 
</td><td> 
 
         Configuration Id: 
 
         <span>&nbsp;</span> 
 
</td> 
 
</tr> 
 
<tr> 
 
<td> 
 
         OES Order Number: 
 
         <span>2076804957</span> 
 
</td><td> 
 
         Customer Number: 
 
         <span>108401</span> 
 
</td><td> 
 
         Delivery Number: 
 
         <span>8519501492</span> 
 
</td> 
 
</tr> 
 
<tr> 
 
<td colspan="2"> 
 
            Service Status: 
 
            <span>This machine is currently out of warranty.</span> 
 
</td><td colspan="1"> 
 
            UAR End Date: 
 
            <span>2012-08-02</span> 
 
</td> 
 
</tr> 
 
</tbody></table></td> 
 
</tr> 
 
<tr> 
 
<td><table class="ibm-data-table ibm-alternating" summary="output table" cellpadding="0" cellspacing="0" border="0"><tbody> 
 
<thead> 
 
<tr> 
 
<th scope="col" colspan="3" class="pg2TableSectionTitle">Warranty and Service Information:</th> 
 
</tr> 
 
</thead> 
 
<tr> 
 
<th scope="col">Start Date</th><th scope="col">End Date</th><th scope="col">SDF</th> 
 
</tr> 
 
<tr> 
 
<td>2012-07-04</td><td>2015-07-03</td><td>3XL</td> 
 
</tr> 
 
<tr> 
 
<td colspan="3"> 
 
        SDF Description: 
 
        <span>This product has a 3 year limited warranty and is entitled to CRU (customer replaceable unit) and On-site service. Tier 1 CRUs are customer responsibility, see announcement for details. On-site Service is available Monday - Friday, except holidays, with a next business day response objective.</span> 
 
</td> 
 
</tr> 
 
</tbody></table></td> 
 
</tr> 
 
<tr> 
 
<td><table class="ibm-data-table ibm-alternating" summary="output table" cellpadding="0" cellspacing="0" border="0"><tbody> 
 
<thead> 
 
<tr> 
 
<th scope="col" colspan="3" class="pg2TableSectionTitle">Upgrade Warranty and Service Information:</th> 
 
</tr> 
 
</thead> 
 
<tr> 
 
<th scope="col">Start Date</th><th scope="col">End Date</th><th scope="col">SDF</th> 
 
</tr> 
 
<tr> 
 
<td>2012-07-04</td><td>2015-07-03</td><td>SP4</td> 
 
</tr> 
 
<tr> 
 
<td colspan="3"> 
 
        SDF Description: 
 
        <span>This product has a three year limited warranty which includes a warranty upgrade. This product is entitled to parts and labor and includes on-site repair service. Service is available 7X24 with an 4 hour response objective.</span> 
 
</td> 
 
</tr> 
 
</tbody></table></td> 
 
</tr> 
 
<tr> 
 
<td><table class="ibm-data-table" cellpadding="0" cellspacing="0" border="0"><thead> 
 
<tr> 
 
<th scope="col" class="pg2MessageHead">Messages</th> 
 
</tr> 
 
</thead> 
 
<tbody> 
 
<tr> 
 
<td class="pg2MessagePanel" align="left">&nbsp;</td> 
 
</tr> 
 
</tbody></table></td> 
 
</tr> 
 
</tbody></table> 
 
        </div>

My working code is:

  public void actionPerformed(ActionEvent e) {     
       try { 
        String getTextArea; 
        getTextArea = textArea.getText(); 
        String[] arr = getTextArea.split("\\n"); 
        String type = null; 
        String serial = null; 
        int line = 0; 
        for(String s : arr) { 

         line++; 
         if(s.isEmpty()) { 
          textArea_1.append("Empty Line" + '\n'); 
          continue; 
         } 

         type = s.substring(0, 4); 
         serial = s.substring(5, 12); 
         String html = "bla bla bla" + type + serial; 

         Document doc = Jsoup.connect(html).get(); 
         Elements tableElements = doc.select("table"); 
         java.util.Iterator<Element> ite = tableElements.select("tr").iterator(); 
         Elements tableElement = doc.select("tr"); 
         java.util.Iterator<Element> ite1 = tableElement.select("table").iterator(); 
         ite.next(); 
         ite1.next(); 

         String result,result1,result2; 
         result = ite.next().text(); 
         result1 = ite1.next().text(); 

         Scanner sr = new Scanner(result); 
         Scanner sr1 = new Scanner(result1); 

//      System.out.println(result); 
//      System.out.println(result1); 

         // result of first table 
         while(sr.hasNext()) { 
          result = result; 
          ite.next().text(); 
          String lineOfType; 
          lineOfType = ite.next().text(); 
          type = lineOfType.substring(6, 10); 
          String model; 
          model = lineOfType.substring(18, 21); 
          serial = lineOfType.substring(30, 37); 
          ite.next().text(); 
          String country = ite.next().text(); 
          country = country.substring(24, 31); 
          textArea_1.append(line + "-" + type + '\t' + model + '\t' + serial + " " + country + " "); 
         } 

         sr.close(); 

         // result of second table 

         while(sr1.hasNext()) { 
          result1 = result1; 
          String startDate = result1.substring(58, 68); 
          String endDate = result1.substring(69, 79); 
          textArea_1.append(startDate + " " + endDate + " "); 
          break; 
         } 

         sr1.close(); 

         // Getting the elements for the 3rd table; not working as expected, it returns the second table's data. 

         Elements tableElement2 = doc.select("tr"); 
         java.util.Iterator<Element> ite2 = tableElement2.select("table").iterator(); 
         ite2.next(); 
         result2 = ite2.next().text(); 
         Scanner sr2 = new Scanner(result2); 


         // this while shows the same result as the second while ! 
         while(sr2.hasNext()) { 
          sr2.next(); 
          result2 = result2; 
          System.out.println(result2); 
          String srvPkStart = result2.substring(58, 68); 
          if(srvPkStart.equals(result1.substring(58, 68))) { 
           srvPkStart = "Not found"; 
          } 
          String srvPkEnd = result2.substring(69, 79); 
          if(srvPkEnd.equals(result1.substring(69, 79))) { 
           srvPkEnd = ""; 
          } 
          System.out.println(srvPkStart + '\t' + srvPkEnd); 
          textArea_1.append("ServicePack Dates: " + srvPkStart + '\t' + srvPkEnd + '\n'); 
          break; 
         } 



        } // end of for loop  
       } catch (Exception e2) { 
        // TODO: handle exception 
       } 
      } 
     }); 

Answer


Let me suggest a different, simpler way to get at the tables: select them by class with org.jsoup.nodes.Element.select().

Check this link to learn how to use the jsoup selector syntax to get elements.

String html = "<body><div id=\"ibm-pcon\"><div id=\"ibm-content\"><div id=\"ibm-leadspace-head\" class=\"ibm-alternate\"><div id=\"ibm-leadspace-body\"><br></br><script type=\"text/javascript\">currentDate();</script><br></br><!--BEGIN OPTIONAL BREADCRUMBING--> <span style=\"font-size: small;\"><a href=\"/pc/entitle/pg2/Service.wss/display/MachineHome\">Machine Lookup</a> &gt; <a href=\"/pc/entitle/pg2/Service.wss/mts/Lookup\">Warranty Information</a> &gt; </span><!--END OPTIONAL BREADCRUMBING--><br></br><h1>PEW | Warranty Information</h1> </div></div><!-- CONTENT_BODY --><div id=\"ibm-content-body\"><div id=\"ibm-content-main\"><table class=\"ibm-results-table\" summary=\"output table\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\"><tbody xmlns=\"www.w3.org/TR/xhtml1/\"><thead> <tr><th scope=\"col\" class=\"pg2OutputTableSectionTitle\">Results of Machine Type/Serial Number Query</th> </tr></thead><tr> <td><table class=\"ibm-data-table ibm-alternating\" summary=\"output table\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\"> <tbody> <thead><tr> <th scope=\"col\" colspan=\"3\" class=\"pg2TableSectionTitle\">General Machine Information:</th></tr> </thead> <tr><td> Type: <span>1746</span></td><td> Model: <span>C4A</span></td><td> Serial: <span>13D06MK</span></td> </tr> <tr><td> Status: <span>Proof Of Purchase Rcvd</span></td><td> Build Date: <span>&nbsp;</span></td><td> Build to Model: <span> </span></td> </tr> <tr><td> Geography: <span>EMEA</span></td><td> Country: <span>GREECE</span></td><td> Configuration Id: <span>&nbsp;</span></td> </tr> <tr><td> OES Order Number: <span>2076804957</span></td><td> Customer Number: <span>108401</span></td><td> Delivery Number: <span>8519501492</span></td> </tr> <tr><td colspan=\"2\"> Service Status: <span>This machine is currently out of warranty.</span></td><td colspan=\"1\"> UAR End Date: <span>2012-08-02</span></td> </tr> </tbody></table> </td></tr><tr> <td><table class=\"ibm-data-table ibm-alternating\" summary=\"output 
table\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\"> <tbody> <thead><tr> <th scope=\"col\" colspan=\"3\" class=\"pg2TableSectionTitle\">Warranty and Service Information:</th></tr> </thead> <tr><th scope=\"col\">Start Date</th><th scope=\"col\">End Date</th><th scope=\"col\">SDF</th> </tr> <tr><td>2012-07-04</td><td>2015-07-03</td><td>3XL</td> </tr> <tr><td colspan=\"3\"> SDF Description: <span>This product has a 3 year limited warranty and is entitled to CRU (customer replaceable unit) and On-site service. Tier 1 CRUs are customer responsibility, see announcement for details. On-site Service is available Monday - Friday, except holidays, with a next business day response objective.</span></td> </tr> </tbody></table> </td></tr><tr> <td><table class=\"ibm-data-table ibm-alternating\" summary=\"output table\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\"> <tbody> <thead><tr> <th scope=\"col\" colspan=\"3\" class=\"pg2TableSectionTitle\">Upgrade Warranty and Service Information:</th></tr> </thead> <tr><th scope=\"col\">Start Date</th><th scope=\"col\">End Date</th><th scope=\"col\">SDF</th> </tr> <tr><td>2012-07-04</td><td>2015-07-03</td><td>SP4</td> </tr> <tr><td colspan=\"3\"> SDF Description: <span>This product has a three year limited warranty which includes a warranty upgrade. This product is entitled to parts and labor and includes on-site repair service.Service is available 7X24 with an 4 hour response objective.</span></td> </tr> </tbody></table> </td></tr><tr> <td><table class=\"ibm-data-table\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\"> <thead><tr> <th scope=\"col\" class=\"pg2MessageHead\">Messages</th></tr> </thead> <tbody><tr> <td class=\"pg2MessagePanel\" align=\"left\">&nbsp;</td></tr> </tbody></table> </td></tr></tbody> </table></div> </body>"; 
    // Needs imports: org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.nodes.Element,
    // org.jsoup.parser.Parser, org.jsoup.select.Elements
    Document doc = Jsoup.parse(html, "", Parser.xmlParser());
    // Select the tables that carry both classes: ibm-data-table and ibm-alternating
    Elements tables = doc.select("table.ibm-data-table.ibm-alternating");

    System.out.println(tables.size()); // 3 for the HTML above

    for (Element ele : tables) {
     // The section title is the header cell with class pg2TableSectionTitle
     Elements thElements = ele.select("tr > th.pg2TableSectionTitle");

     if (!thElements.isEmpty()) {
      String tableTitle = thElements.get(0).text();
      System.out.println(tableTitle);

      if (tableTitle.contains("General Machine Information:")) {
       // Apply your logic for the "General Machine Information" table
      }
      // Test the "Upgrade ..." title first: it contains the plain
      // "Warranty and Service Information:" title as a substring.
      else if (tableTitle.contains("Upgrade Warranty and Service Information:")) {
       // Apply your logic for the "Upgrade Warranty and Service Information" table
      }
      else if (tableTitle.contains("Warranty and Service Information:")) {
       // Apply your logic for the "Warranty and Service Information" table
      }
     }
    }
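
To make the "apply your logic" step concrete, here is a minimal sketch of reading the Start Date, End Date and SDF cells from the table with a given title. It assumes the fetched page keeps the class names and titles from the question; `UpgradeWarrantyExample`, `SAMPLE_HTML` and `datesFor` are hypothetical names introduced for illustration, and the sample markup is a trimmed-down stand-in for the real page.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class UpgradeWarrantyExample {

    // Trimmed-down stand-in for the page in the question
    // (assumption: the real page uses the same classes and titles).
    static final String SAMPLE_HTML =
        "<table class=\"ibm-results-table\"><tbody>"
        + "<tr><td><table class=\"ibm-data-table ibm-alternating\">"
        + "<thead><tr><th colspan=\"3\" class=\"pg2TableSectionTitle\">"
        + "Warranty and Service Information:</th></tr></thead>"
        + "<tbody><tr><th>Start Date</th><th>End Date</th><th>SDF</th></tr>"
        + "<tr><td>2012-07-04</td><td>2015-07-03</td><td>3XL</td></tr></tbody>"
        + "</table></td></tr>"
        + "<tr><td><table class=\"ibm-data-table ibm-alternating\">"
        + "<thead><tr><th colspan=\"3\" class=\"pg2TableSectionTitle\">"
        + "Upgrade Warranty and Service Information:</th></tr></thead>"
        + "<tbody><tr><th>Start Date</th><th>End Date</th><th>SDF</th></tr>"
        + "<tr><td>2012-07-04</td><td>2015-07-03</td><td>SP4</td></tr></tbody>"
        + "</table></td></tr>"
        + "</tbody></table>";

    /**
     * Returns {startDate, endDate, sdf} from the first data table whose
     * section title starts with titlePrefix, or null if there is none.
     */
    static String[] datesFor(Document doc, String titlePrefix) {
        for (Element table : doc.select("table.ibm-data-table.ibm-alternating")) {
            Element title = table.select("th.pg2TableSectionTitle").first();
            if (title == null || !title.text().startsWith(titlePrefix)) {
                continue; // not the table we are looking for
            }
            // The first three <td> cells hold start date, end date and SDF.
            Elements cells = table.select("td");
            if (cells.size() >= 3) {
                return new String[] {
                    cells.get(0).text(), cells.get(1).text(), cells.get(2).text()
                };
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(SAMPLE_HTML);
        String[] upgrade = datesFor(doc, "Upgrade Warranty");
        System.out.println(upgrade[0] + " " + upgrade[1] + " " + upgrade[2]);
    }
}
```

Matching with `startsWith` rather than `contains` keeps the two warranty tables apart, since the upgrade title contains the plain title as a substring. Reading cells by element also avoids the fixed character offsets used in the question, which shift as soon as the text changes length.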
Comments

This is a nice idea, but unfortunately it does not work for me, because my HTML does not arrive the way you show it. In my tool I first enter the data into a text field and then press Run, so that I get the expected result for every entry. When I try your code now, it reports a table count of 0 and then prints nothing else. The site I am using is something like http://w3-01.ibm.com/pc/entitle/pg2/Service.wss/mts/Lookup?type=12345&serial=123456789

I tried changing how the Document is created and it works now :D Document doc = Jsoup.connect(html).get();

@AboelmagdSaad The HTML I used is exactly what you provided. I assume you are using Jsoup.connect() to get the HTML source. As long as the returned HTML source matches the one given in your question, my code/jsoup works. To check the HTML source you received, call Document.html().
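
The lookup URL quoted in the comments takes the machine type and serial as query parameters. A minimal sketch of building it from one input line follows, using only the standard library. It assumes each line holds the type and serial separated by whitespace (the question's code uses fixed offsets instead); `LookupUrlExample`, `parseLine` and `lookupUrl` are hypothetical names, and the base URL is copied from the comment, not verified.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class LookupUrlExample {

    // Base URL as quoted in the comments (assumption: still the right endpoint).
    static final String BASE =
        "http://w3-01.ibm.com/pc/entitle/pg2/Service.wss/mts/Lookup";

    /**
     * Splits one input line such as "1746 13D06MK" into {type, serial}.
     * Returns null for blank or malformed lines instead of throwing.
     */
    static String[] parseLine(String line) {
        String[] parts = line.trim().split("\\s+");
        return parts.length == 2 && !parts[0].isEmpty() ? parts : null;
    }

    /** Builds the lookup URL with both values URL-encoded. */
    static String lookupUrl(String type, String serial) {
        return BASE + "?type=" + encode(type) + "&serial=" + encode(serial);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] parts = parseLine("1746 13D06MK");
        String url = lookupUrl(parts[0], parts[1]);
        System.out.println(url);
        // With jsoup on the classpath, the page could then be fetched with
        // Jsoup.connect(url).get() and inspected with Document.html().
    }
}
```

Splitting on whitespace tolerates types or serials of other lengths, where `substring(0, 4)` and `substring(5, 12)` would throw on a short line.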