2016-06-06 10 views
3

BeautifulSoup, urllib2 and Requests did not find all HTML tags on 9gag.com

I am trying to scrape the 9gag comment section to do some sentiment analysis and label each post as positive or negative. The ultimate goal is to train on the data from thousands of posts and predict the sentiment of a post based on its comment count, post upvotes, the upvotes of the top ten comments, and the post title.

I successfully scraped the hot section for titles and upvotes, but when it comes to scraping comments, the HTML parser does not show the relevant tags. I have tried various libraries such as BS4, Requests, Pattern and urllib/urllib2. I even tried 'html.parser' instead of lxml.

My question: is the 9gag comment section protected against scraping? If not, is there a reason why none of the parsers can get all the tags?

Update #2: Here is the code I used:

import requests
from bs4 import BeautifulSoup

# requests expects a plain string URL
url = "http://9gag.com/gag/a1Mzz1D"
req = requests.get(url)
soup = BeautifulSoup(req.text, 'html.parser')
print(soup.findAll("div", attrs={"class": "comment-embed"}))

The output is an empty list: []

+1

Can you post the relevant parts of your code and their output? Also, do you know whether you get an exception or just empty output? – dmcc

Answer

2

The data is loaded using React. You can, however, do a little parsing and get all the data you need in JSON format:

import ast
import re

import requests
from bs4 import BeautifulSoup
from urlparse import urljoin  # Python 2; on Python 3 use: from urllib.parse import urljoin

base = "http://9gag.com/"

# These are the params needed to get the json.
params = {"appId": "",
          "url": "",
          "count": "10",
          "level": "2",
          "order": "score",
          "mentionMapping": "true",
          "origin": "9gag.com"}

# Endpoint that serves the comments as json.
js = "http://comment-cdn.9gag.com/v1/cacheable/comment-list.json"

with requests.session() as s:
    r = s.get(base)
    soup = BeautifulSoup(r.content, "lxml")
    # Links to each actual page.
    links = [urljoin(base, a["href"]) for a in soup.select("a.badge-evt.point")]
    for link in links:
        cont = s.get(link).content
        soup = BeautifulSoup(cont, "lxml")
        # The params are all in the script body.
        script = soup.find("script", text=re.compile('appId')).text
        # Convert to a dict so we can pull what we need by key.
        data = ast.literal_eval(script[script.find("{"):script.rfind("}") + 1])
        params["appId"] = data["appId"]
        params["url"] = data["url"]
        page_json = s.get(js, params=params).json()
        for dct in page_json["payload"]["comments"]:
            print(dct)
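
The slice-and-literal_eval step is easier to follow on a fixed string. The following is only a minimal sketch: `sample_script` is a made-up script body (not taken from 9gag), and `extract_params` is a hypothetical helper name introduced for illustration.

```python
import ast

# Hypothetical script body, invented for illustration; the script on the
# real page will look different.
sample_script = """
var config = {"appId": "a_example", "url": "http://9gag.com/gag/example"};
"""


def extract_params(script_text):
    """Parse the span from the first '{' to the last '}' as a dict."""
    start = script_text.find("{")
    end = script_text.rfind("}") + 1
    # literal_eval only accepts Python literals, so this works when the
    # object uses quoted keys and plain string/number values.
    return ast.literal_eval(script_text[start:end])


data = extract_params(sample_script)
print(data["appId"])
print(data["url"])
```

If the embedded object contained JSON-only tokens such as `true` or `null`, `ast.literal_eval` would raise an error and `json.loads` would be the safer choice.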

Running the scraping loop with only the first link returned gives:

In [28]: with requests.session() as s: 
    ....:   r = s.get(base) 
    ....:   soup = BeautifulSoup(r.content,"lxml") 
    ....:   links = [urljoin(base, a["href"]) for a in soup.select("a.comment.badge-evt")][:1] 
    ....:   for link in links: 
    ....:     cont = s.get(link).content 
    ....:     soup = BeautifulSoup(cont,"lxml") 
    ....:     script = soup.find("script", text=re.compile('appId')).text 
    ....:     data = ast.literal_eval(script[script.find("{"):script.rfind("}") + 1]) 
    ....:     params["appId"] = data["appId"] 
    ....:     params["url"] = data["url"] 
    ....:     page_json = s.get(js, params=params).json() 
    ....:     for dct in page_json["payload"]["comments"]: 
    ....:       print(dct) 
    ....:    
{u'hasNext': True, u'dislikeCount': 0, u'text': u'This is so awkward to watch ... and funny', u'userId': u'u_13759018032623', u'likeCount': 343, u'orderKey': u'score_00000000004834_14651297124662', u'children': [{u'hasNext': False, u'dislikeCount': 0, u'text': u'@twistedpickle.and also fake.', u'userId': u'u_145548331532421082', u'likeCount': 26, u'children': [], u'isCollapsed': 0, u'mediaText': u'@twistedpickle.and also fake.', u'section': u'', u'mentionMapping': {u'@twistedpickle': u'aBL7q1'}, u'commentId': u'c_146513113612585611', u'type': u'text', u'status': 0, u'parent': u'c_146512971246623391', u'timestamp': 1465131136, u'embedMediaMeta': {u'dummy': []}, u'user': {u'displayName': u'savage_ali', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/34323189_100_45.jpg', u'timestamp': u'1455483315', u'userId': u'u_145548331532421082', u'hashedAccountId': u'anbN66n', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/savage_ali'}, u'accountId': u'34323189', u'permissions': []}, u'isUrl': 0, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146513113612585611', u'level': 2, u'suppData': {}, u'richtext': u'@twistedpickle.and also fake.', u'childrenTotal': 0, u'isAnonymous': 0}], u'isCollapsed': 0, u'mediaText': u'This is so awkward to watch ... 
and funny', u'section': u'', u'mentionMapping': {u'dummy': u''}, u'commentId': u'c_146512971246623391', u'type': u'text', u'status': 0, u'parent': u'c_146493707813378457', u'timestamp': 1465129712, u'embedMediaMeta': {u'dummy': []}, u'user': {u'displayName': u'twistedpickle', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/1870095_100_1.jpg', u'timestamp': u'1375901803', u'userId': u'u_13759018032623', u'hashedAccountId': u'aBL7q1', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/twistedpickle'}, u'accountId': u'1870095', u'permissions': []}, u'isUrl': 0, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146512971246623391', u'level': 1, u'suppData': {}, u'richtext': u'This is so awkward to watch ... and funny', u'childrenTotal': 19, u'isAnonymous': 0} 
{u'hasNext': True, u'dislikeCount': 0, u'text': u'Hahaha PANTURA', u'userId': u'u_143454521023534763', u'likeCount': 231, u'orderKey': u'score_00000000004076_14649387351969', u'children': [{u'hasNext': False, u'dislikeCount': 0, u'text': u'@deadfight nussittuna nukut paremmin', u'userId': u'u_141790386790069041', u'likeCount': 39, u'children': [], u'isCollapsed': 0, u'mediaText': u'@deadfight nussittuna nukut paremmin', u'section': u'', u'mentionMapping': {u'@deadfight': u'aYLgpy7'}, u'commentId': u'c_146513018381635287', u'type': u'text', u'status': 0, u'parent': u'c_146493873519691145', u'timestamp': 1465130183, u'embedMediaMeta': {u'dummy': []}, u'user': {u'displayName': u'lady_kappa', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/22251683_100_38.jpg', u'timestamp': u'1417903867', u'userId': u'u_141790386790069041', u'hashedAccountId': u'a5K8b5N', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/lady_kappa'}, u'accountId': u'22251683', u'permissions': []}, u'isUrl': 0, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146513018381635287', u'level': 2, u'suppData': {}, u'richtext': u'@deadfight nussittuna nukut paremmin', u'childrenTotal': 0, u'isAnonymous': 0}], u'isCollapsed': 0, u'mediaText': u'Hahaha PANTURA', u'section': u'', u'mentionMapping': {u'dummy': u''}, u'commentId': u'c_146493873519691145', u'type': u'text', u'status': 0, u'parent': u'c_146493707813378457', u'timestamp': 1464938735, u'embedMediaMeta': {u'dummy': []}, u'user': {u'displayName': u'deadfight', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/27180133_100_2.jpg', u'timestamp': u'1434545210', u'userId': u'u_143454521023534763', u'hashedAccountId': u'aYLgpy7', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/deadfight'}, u'accountId': u'27180133', u'permissions': []}, u'isUrl': 0, u'isLike': {u'value': 0}, u'permalink': 
u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146493873519691145', u'level': 1, u'suppData': {}, u'richtext': u'Hahaha PANTURA', u'childrenTotal': 16, u'isAnonymous': 0} 
{u'hasNext': True, u'dislikeCount': 0, u'text': u'http://i.memeful.com/media/post/oMJ28xM_700wa_0.gif', u'userId': u'u_141680114571912397', u'likeCount': 225, u'orderKey': u'score_00000000003373_14649381081078', u'children': [{u'hasNext': False, u'dislikeCount': 0, u'text': u'@shogun_ka_yo up you go', u'userId': u'u_144283683005248817', u'likeCount': 2, u'children': [], u'isCollapsed': 0, u'mediaText': u'@shogun_ka_yo up you go', u'section': u'', u'mentionMapping': {u'@shogun_ka_yo': u'aMQRLRW'}, u'commentId': u'c_146513150738658348', u'type': u'text', u'status': 0, u'parent': u'c_146493810810784782', u'timestamp': 1465131507, u'embedMediaMeta': {u'dummy': []}, u'user': {u'displayName': u'dergermanyball', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/29998985_100_29.jpg', u'timestamp': u'', u'userId': u'u_144283683005248817', u'hashedAccountId': u'a1dpXrY', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/dergermanyball'}, u'accountId': u'29998985', u'permissions': []}, u'isUrl': 0, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146513150738658348', u'level': 2, u'suppData': {}, u'richtext': u'@shogun_ka_yo up you go', u'childrenTotal': 0, u'isAnonymous': 0}], u'isCollapsed': 0, u'mediaText': u'http://i.memeful.com/media/post/oMJ28xM_700wa_0.gif', u'section': u'', u'mentionMapping': {u'dummy': u''}, u'commentId': u'c_146493810810784782', u'type': u'media', u'status': 0, u'parent': u'c_146493707813378457', u'timestamp': 1464938108, u'embedMediaMeta': {u'embedImage': {u'type': u'ANIMATED', u'image': {u'url': u'http://img-comment-fun.9cache.com/media/287e9c03142644331422775855_700w_0.jpg', u'width': 400, u'height': 206}, u'animated': {u'url': u'http://img-comment-fun.9cache.com/media/287e9c03142644331422775855_700wa_0.gif', u'width': 400, u'height': 206}, u'video': {u'url': u'http://img-comment-fun.9cache.com/media/287e9c03142644331422775855_700wv_0.mp4', u'width': 400, u'height': 
206}}}, u'user': {u'displayName': u'shogun_ka_yo', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/22391718_100_2.jpg', u'timestamp': u'1416801145', u'userId': u'u_141680114571912397', u'hashedAccountId': u'aMQRLRW', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/shogun_ka_yo'}, u'accountId': u'22391718', u'permissions': []}, u'isUrl': 1, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146493810810784782', u'level': 1, u'suppData': {}, u'richtext': u'[url]http://i.memeful.com/media/post/oMJ28xM_700wa_0.gif[/url]', u'childrenTotal': 4, u'isAnonymous': 0} 
{u'hasNext': True, u'dislikeCount': 0, u'text': u'Now imagine if the genders were reversed', u'userId': u'u_143552720523387146', u'likeCount': 179, u'orderKey': u'score_00000000003144_14651301155438', u'children': [{u'hasNext': False, u'dislikeCount': 0, u'text': u'@rednotash hush little one. You're making sense now', u'userId': u'u_141363015125977644', u'likeCount': 77, u'children': [], u'isCollapsed': 0, u'mediaText': u'@rednotash hush little one. You're making sense now', u'section': u'', u'mentionMapping': {u'@rednotash': u'aOv8RMy'}, u'commentId': u'c_146513114535963914', u'type': u'text', u'status': 0, u'parent': u'c_146513011554386056', u'timestamp': 1465131145, u'embedMediaMeta': {u'dummy': []}, u'user': {u'displayName': u'srslydude', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/default-avatar/1_59_100_v0.jpg', u'timestamp': u'1413630151', u'userId': u'u_141363015125977644', u'hashedAccountId': u'aYwvpZx', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/srslydude'}, u'accountId': u'21558777', u'permissions': []}, u'isUrl': 0, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146513114535963914', u'level': 2, u'suppData': {}, u'richtext': u'@rednotash hush little one. 
You're making sense now', u'childrenTotal': 0, u'isAnonymous': 0}], u'isCollapsed': 0, u'mediaText': u'Now imagine if the genders were reversed', u'section': u'', u'mentionMapping': {u'dummy': u''}, u'commentId': u'c_146513011554386056', u'type': u'text', u'status': 0, u'parent': u'c_146493707813378457', u'timestamp': 1465130115, u'embedMediaMeta': {u'dummy': []}, u'user': {u'displayName': u'rednotash', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/27823975_100_5.jpg', u'timestamp': u'1435527205', u'userId': u'u_143552720523387146', u'hashedAccountId': u'aOv8RMy', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/rednotash'}, u'accountId': u'27823975', u'permissions': []}, u'isUrl': 0, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146513011554386056', u'level': 1, u'suppData': {}, u'richtext': u'Now imagine if the genders were reversed', u'childrenTotal': 9, u'isAnonymous': 0} 
{u'hasNext': True, u'dislikeCount': 0, u'text': u'Never let your waif follow you? Well she wouldnt follow you if you werent a dickhead. Women have the sixth sense . We know whats going on.', u'userId': u'u_145321627176216569', u'likeCount': 78, u'orderKey': u'score_00000000002462_14651303108023', u'children': [{u'hasNext': False, u'dislikeCount': 0, u'text': u'@marshmallowww What if I tell you that gender has nothing to do with it? Men have that "sixth sense" too.', u'userId': u'u_143741207696358239', u'likeCount': 56, u'children': [], u'isCollapsed': 0, u'mediaText': u'@marshmallowww What if I tell you that gender has nothing to do with it? Men have that "sixth sense" too.', u'section': u'', u'mentionMapping': {u'@marshmallowww': u'ab693MB'}, u'commentId': u'c_146513102333226094', u'type': u'text', u'status': 0, u'parent': u'c_146513031080236628', u'timestamp': 1465131023, u'embedMediaMeta': {u'dummy': []}, u'user': {u'displayName': u'the_hidden', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/28267060_100_15.jpg', u'timestamp': u'1437412076', u'userId': u'u_143741207696358239', u'hashedAccountId': u'aop4wG2', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/the_hidden'}, u'accountId': u'28267060', u'permissions': []}, u'isUrl': 0, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146513102333226094', u'level': 2, u'suppData': {}, u'richtext': u'@marshmallowww What if I tell you that gender has nothing to do with it? Men have that "sixth sense" too.', u'childrenTotal': 0, u'isAnonymous': 0}], u'isCollapsed': 0, u'mediaText': u'Never let your waif follow you? Well she wouldnt follow you if you werent a dickhead. Women have the sixth sense . 
We know whats going on.', u'section': u'', u'mentionMapping': {u'dummy': u''}, u'commentId': u'c_146513031080236628', u'type': u'text', u'status': 0, u'parent': u'c_146493707813378457', u'timestamp': 1465130310, u'embedMediaMeta': {u'dummy': []}, u'user': {u'displayName': u'marshmallowww', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/33477821_100_134.jpg', u'timestamp': u'1453216271', u'userId': u'u_145321627176216569', u'hashedAccountId': u'ab693MB', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/marshmallowww'}, u'accountId': u'33477821', u'permissions': []}, u'isUrl': 0, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146513031080236628', u'level': 1, u'suppData': {}, u'richtext': u'Never let your waif follow you? Well she wouldnt follow you if you werent a dickhead. Women have the sixth sense . We know whats going on.', u'childrenTotal': 20, u'isAnonymous': 0} 
{u'hasNext': True, u'dislikeCount': 0, u'text': u'But is correct that she can hit him? i mean, "no violence" right? if SHE is drunk and doing stupid things, and the husband go and hit her, is correct too? because equality.', u'userId': u'u_143329792027606743', u'likeCount': 54, u'orderKey': u'score_00000000001796_14651298735006', u'children': [{u'hasNext': False, u'dislikeCount': 0, u'text': u'@pcmasteracer yes it's correct', u'userId': u'u_143073218849877360', u'likeCount': 9, u'children': [], u'isCollapsed': 0, u'mediaText': u'@pcmasteracer yes it's correct', u'section': u'', u'mentionMapping': {u'@pcmasteracer': u'avnOvdq'}, u'commentId': u'c_146513013516459530', u'type': u'text', u'status': 0, u'parent': u'c_146512987350064451', u'timestamp': 1465130135, u'embedMediaMeta': {u'dummy': []}, u'user': {u'displayName': u'kkakuka97', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/26450856_100_3.jpg', u'timestamp': u'1430732188', u'userId': u'u_143073218849877360', u'hashedAccountId': u'a4j4NWy', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/kkakuka97'}, u'accountId': u'26450856', u'permissions': []}, u'isUrl': 0, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146513013516459530', u'level': 2, u'suppData': {}, u'richtext': u'@pcmasteracer yes it's correct', u'childrenTotal': 0, u'isAnonymous': 0}], u'isCollapsed': 0, u'mediaText': u'But is correct that she can hit him? i mean, "no violence" right? if SHE is drunk and doing stupid things, and the husband go and hit her, is correct too? 
because equality.', u'section': u'', u'mentionMapping': {u'dummy': u''}, u'commentId': u'c_146512987350064451', u'type': u'text', u'status': 0, u'parent': u'c_146493707813378457', u'timestamp': 1465129873, u'embedMediaMeta': {u'dummy': []}, u'user': {u'displayName': u'pcmasteracer', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/default-avatar/1_62_100_v0.jpg', u'timestamp': u'1433297920', u'userId': u'u_143329792027606743', u'hashedAccountId': u'avnOvdq', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/pcmasteracer'}, u'accountId': u'27225255', u'permissions': []}, u'isUrl': 0, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146512987350064451', u'level': 1, u'suppData': {}, u'richtext': u'But is correct that she can hit him? i mean, "no violence" right? if SHE is drunk and doing stupid things, and the husband go and hit her, is correct too? because equality.', u'childrenTotal': 7, u'isAnonymous': 0} 
{u'hasNext': False, u'dislikeCount': 0, u'text': u'I can hear the 'BONG!'', u'userId': u'u_13987497367750', u'likeCount': 30, u'orderKey': u'score_00000000001168_14650124142865', u'children': [{u'hasNext': False, u'dislikeCount': 0, u'text': u'@yajirobe__ but not boing', u'userId': u'u_13775281935884', u'likeCount': 4, u'children': [], u'isCollapsed': 0, u'mediaText': u'@yajirobe__ but not boing', u'section': u'', u'mentionMapping': {u'@yajirobe__': u'avgE1Y5'}, u'commentId': u'c_146513060674619430', u'type': u'text', u'status': 0, u'parent': u'c_146501241428653553', u'timestamp': 1465130606, u'embedMediaMeta': {u'dummy': []}, u'user': {u'displayName': u'siophang', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/11455251_100_2.jpg', u'timestamp': u'1377528193', u'userId': u'u_13775281935884', u'hashedAccountId': u'aBQK6qO', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/siophang'}, u'accountId': u'11455251', u'permissions': []}, u'isUrl': 0, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146513060674619430', u'level': 2, u'suppData': {}, u'richtext': u'@yajirobe__ but not boing', u'childrenTotal': 0, u'isAnonymous': 0}], u'isCollapsed': 0, u'mediaText': u'I can hear the 'BONG!'', u'section': u'', u'mentionMapping': {u'dummy': u''}, u'commentId': u'c_146501241428653553', u'type': u'text', u'status': 0, u'parent': u'c_146493707813378457', u'timestamp': 1465012414, u'embedMediaMeta': {u'dummy': []}, u'user': {u'displayName': u'yajirobe__', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/16992199_100_5.jpg', u'timestamp': u'1398749736', u'userId': u'u_13987497367750', u'hashedAccountId': u'avgE1Y5', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/yajirobe__'}, u'accountId': u'16992199', u'permissions': []}, u'isUrl': 0, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146501241428653553', u'level': 1, 
u'suppData': {}, u'richtext': u'I can hear the 'BONG!'', u'childrenTotal': 1, u'isAnonymous': 0} 
{u'hasNext': False, u'dislikeCount': 0, u'text': u'http://i.memeful.com/media/post/PRoPBdo_700wa_0.gif', u'userId': u'u_13907047642371', u'likeCount': 21, u'orderKey': u'score_00000000000967_14649476233018', u'children': [{u'hasNext': False, u'dislikeCount': 0, u'text': u'@kaylaruffalo mfw', u'userId': u'u_13907047642371', u'likeCount': 0, u'children': [], u'isCollapsed': 0, u'mediaText': u'@kaylaruffalo mfw', u'section': u'', u'mentionMapping': {u'@kaylaruffalo': u'adYKGQj'}, u'commentId': u'c_146494763324897147', u'type': u'text', u'status': 0, u'parent': u'c_146494762330186947', u'timestamp': 1464947633, u'embedMediaMeta': {u'dummy': []}, u'user': {u'displayName': u'kaylaruffalo', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/16005886_100_9.jpg', u'timestamp': u'1390704764', u'userId': u'u_13907047642371', u'hashedAccountId': u'adYKGQj', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/kaylaruffalo'}, u'accountId': u'16005886', u'permissions': []}, u'isUrl': 0, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146494763324897147', u'level': 2, u'suppData': {}, u'richtext': u'@kaylaruffalo mfw', u'childrenTotal': 0, u'isAnonymous': 0}], u'isCollapsed': 0, u'mediaText': u'http://i.memeful.com/media/post/PRoPBdo_700wa_0.gif', u'section': u'', u'mentionMapping': {u'dummy': u''}, u'commentId': u'c_146494762330186947', u'type': u'media', u'status': 0, u'parent': u'c_146493707813378457', u'timestamp': 1464947623, u'embedMediaMeta': {u'embedImage': {u'type': u'ANIMATED', u'image': {u'url': u'http://img-comment-fun.9cache.com/media/872be169144077120242844098_700w_0.jpg', u'width': 500, u'height': 400}, u'animated': {u'url': u'http://img-comment-fun.9cache.com/media/872be169144077120242844098_700wa_0.gif', u'width': 500, u'height': 400}, u'video': {u'url': u'http://img-comment-fun.9cache.com/media/872be169144077120242844098_700wv_0.mp4', u'width': 500, u'height': 400}}}, u'user': 
{u'displayName': u'kaylaruffalo', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/16005886_100_9.jpg', u'timestamp': u'1390704764', u'userId': u'u_13907047642371', u'hashedAccountId': u'adYKGQj', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/kaylaruffalo'}, u'accountId': u'16005886', u'permissions': []}, u'isUrl': 1, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146494762330186947', u'level': 1, u'suppData': {}, u'richtext': u'[url]http://i.memeful.com/media/post/PRoPBdo_700wa_0.gif[/url]', u'childrenTotal': 1, u'isAnonymous': 0} 
{u'hasNext': False, u'dislikeCount': 0, u'text': u'Look at the dude in the red shirt run XD', u'userId': u'u_144176454299618603', u'likeCount': 15, u'orderKey': u'score_00000000000806_14651298710300', u'children': [{u'hasNext': False, u'dislikeCount': 0, u'text': u'@crazybrownguy he knew he was next', u'userId': u'u_13976607580627', u'likeCount': 1, u'children': [], u'isCollapsed': 0, u'mediaText': u'@crazybrownguy he knew he was next', u'section': u'', u'mentionMapping': {u'@crazybrownguy': u'agGWL5q'}, u'commentId': u'c_146514413390208345', u'type': u'text', u'status': 0, u'parent': u'c_146512987103009031', u'timestamp': 1465144133, u'embedMediaMeta': {u'dummy': []}, u'user': {u'displayName': u'lightfoot2012', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/17248879_100_6.jpg', u'timestamp': u'1397660758', u'userId': u'u_13976607580627', u'hashedAccountId': u'axZPvbp', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/lightfoot2012'}, u'accountId': u'17248879', u'permissions': []}, u'isUrl': 0, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146514413390208345', u'level': 2, u'suppData': {}, u'richtext': u'@crazybrownguy he knew he was next', u'childrenTotal': 0, u'isAnonymous': 0}], u'isCollapsed': 0, u'mediaText': u'Look at the dude in the red shirt run XD', u'section': u'', u'mentionMapping': {u'dummy': u''}, u'commentId': u'c_146512987103009031', u'type': u'text', u'status': 0, u'parent': u'c_146493707813378457', u'timestamp': 1465129871, u'embedMediaMeta': {u'dummy': []}, u'user': {u'displayName': u'crazybrownguy', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/29662036_100_10.jpg', u'timestamp': u'1441764542', u'userId': u'u_144176454299618603', u'hashedAccountId': u'agGWL5q', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/crazybrownguy'}, u'accountId': u'29662036', u'permissions': []}, u'isUrl': 0, u'isLike': {u'value': 0}, 
u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146512987103009031', u'level': 1, u'suppData': {}, u'richtext': u'Look at the dude in the red shirt run XD', u'childrenTotal': 1, u'isAnonymous': 0} 
{u'hasNext': True, u'dislikeCount': 0, u'text': u'http://i.memeful.com/media/post/kRp6z2w_700wa_0.gif', u'userId': u'u_144337172763285563', u'likeCount': 5, u'orderKey': u'score_00000000000626_14651301539010', u'children': [{u'hasNext': False, u'dislikeCount': 0, u'text': u'@wat_ya_doin I agree with that wife', u'userId': u'u_144337172763285563', u'likeCount': 3, u'children': [], u'isCollapsed': 0, u'mediaText': u'@wat_ya_doin I agree with that wife', u'section': u'', u'mentionMapping': {u'@wat_ya_doin': u'ay8yRoM'}, u'commentId': u'c_146513018506335085', u'type': u'text', u'status': 0, u'parent': u'c_146513015390105680', u'timestamp': 1465130185, u'embedMediaMeta': {u'dummy': []}, u'user': {u'displayName': u'wat_ya_doin', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/29948571_100_6.jpg', u'timestamp': u'', u'userId': u'u_144337172763285563', u'hashedAccountId': u'ay8yRoM', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/wat_ya_doin'}, u'accountId': u'29948571', u'permissions': []}, u'isUrl': 0, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146513018506335085', u'level': 2, u'suppData': {}, u'richtext': u'@wat_ya_doin I agree with that wife', u'childrenTotal': 0, u'isAnonymous': 0}], u'isCollapsed': 0, u'mediaText': u'http://i.memeful.com/media/post/kRp6z2w_700wa_0.gif', u'section': u'', u'mentionMapping': {u'dummy': u''}, u'commentId': u'c_146513015390105680', u'type': u'media', u'status': 0, u'parent': u'c_146493707813378457', u'timestamp': 1465130153, u'embedMediaMeta': {u'embedImage': {u'type': u'ANIMATED', u'image': {u'url': u'http://img-comment-fun.9cache.com/media/be90178a145186181304494323_700w_0.jpg', u'width': 319, u'height': 260}, u'animated': {u'url': u'http://img-comment-fun.9cache.com/media/be90178a145186181304494323_700wa_0.gif', u'width': 319, u'height': 260}, u'video': {u'url': u'http://img-comment-fun.9cache.com/media/be90178a145186181304494323_700wv_0.mp4', 
u'width': 318, u'height': 260}}}, u'user': {u'displayName': u'wat_ya_doin', u'avatarUrl': u'http://accounts-cdn.9gag.com/media/avatar/29948571_100_6.jpg', u'timestamp': u'', u'userId': u'u_144337172763285563', u'hashedAccountId': u'ay8yRoM', u'profileUrls': {u'a_dd8f2b7d304a10edaf6f29517ea0ca4100a43d1b': u'http://9gag.com/u/wat_ya_doin'}, u'accountId': u'29948571', u'permissions': []}, u'isUrl': 1, u'isLike': {u'value': 0}, u'permalink': u'http://9gag.com/gag/a4YM4n1#cs_comment_id=c_146513015390105680', u'level': 1, u'suppData': {}, u'richtext': u'[url]http://i.memeful.com/media/post/kRp6z2w_700wa_0.gif[/url]', u'childrenTotal': 3, u'isAnonymous': 0} 

As an example, we can pull the text from each dct and then iterate over dct["children"] to get more comments:

In [30]: params = {"appId": "", 
    ....:   "url": "", 
    ....:   "count": "2", 
    ....:   "level": "2", 
    ....:   "order": "score", 
    ....:   "mentionMapping": "true", 
    ....:   "origin": "9gag.com"} 

In [31]: js = "http://comment-cdn.9gag.com/v1/cacheable/comment-list.json" 

In [32]: with requests.session() as s: 
    ....:   r = s.get(base) 
    ....:   soup = BeautifulSoup(r.content,"lxml") 
    ....:   links = [urljoin(base, a["href"]) for a in soup.select("a.badge-evt.point")][:1] 
    ....:   for link in links: 
    ....:     cont = s.get(link).content 
    ....:     soup = BeautifulSoup(cont,"lxml") 
    ....:     script = soup.find("script", text=re.compile('appId')).text 
    ....:     data = ast.literal_eval(script[script.find("{"):script.rfind("}") + 1]) 
    ....:     params["appId"] = data["appId"] 
    ....:     params["url"] = data["url"] 
    ....:     page_json = s.get(js, params=params).json() 
    ....:     for dct in page_json["payload"]["comments"]: 
    ....:       print(dct["text"]) 
    ....:       for child in dct["children"]: 
    ....:         print(child["text"]) 
    ....:     

Once again this is a post made by someone who has no idea what true love is. True love is jealous, painful, and difficult. It's a battle it always will be. You're either fighting yourself to be a better person, fighting life to give the other person the life they deserve or fighting the other person. But true love is worth all of it, its also beautiful, kind, gentle and warm. No relationship is perfect. There is not "8 ways to know". The one for you is the one who will put up with your shit but at the same time make you want to make yourself a better person. Your true love will get on your nerves, piss you off, hurt you, but they will also love you, hold you up when you can't and forgive you. True love is when you find someone you can stand beside through anything, someone who would never want to hurt you When you find someone you can trust no matter what. No one is perfect and there is more than one person in the world you can fall in love with, but when you find that person, you fi 
@celticdraconian this Is so true 
Comment complaining that this will lead straight to the "friendzone" 
Comment saying the "Friendzone" is not a thing. 

You can see that I changed the count param to 2. To get all the data, set it to a really high number such as "count": "1000", which returns everything you would see when loading more comments on the page.
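
For the sentiment-analysis goal in the question, it may help to flatten each comment and its replies into simple records. This is a minimal sketch, assuming the payload keys shown in the printed output (`text`, `likeCount`, `level`, `children`); `flatten_comments` and the `sample` data are hypothetical and exist only for illustration.

```python
def flatten_comments(comments):
    """Yield (level, likeCount, text) for each comment and, depth-first, its replies."""
    for comment in comments:
        yield (comment.get("level"),
               comment.get("likeCount", 0),
               comment.get("text", ""))
        # Replies share the same structure, so recurse into them.
        for row in flatten_comments(comment.get("children", [])):
            yield row


# Made-up sample that mirrors the keys in the printed payload.
sample = [
    {"level": 1, "likeCount": 10, "text": "top comment",
     "children": [{"level": 2, "likeCount": 3, "text": "a reply",
                   "children": []}]},
    {"level": 1, "likeCount": 5, "text": "another comment", "children": []},
]

rows = list(flatten_comments(sample))
for level, likes, text in rows:
    print("%s %s %s" % (level, likes, text))
```

With real data, page_json["payload"]["comments"] would be passed in place of `sample`.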

1

Their comments are loaded via ReactJS, so you need something that executes JavaScript in order to retrieve the comment section.

A few to get you started:
