With requests and BS4, you just need to pass the correct params, which you can see by looking at the Network tab in Chrome Developer Tools:
In [4]: from bs4 import BeautifulSoup
In [5]: import requests
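In [6]: params = {"url": "search-alias=",
   ...:           "field-keywords": "python"}
In [7]: with requests.Session() as s:
   ...:     url = "https://www.amazon.co.uk/s"
   ...:     r = s.get(url, params=params)
   ...:     soup = BeautifulSoup(r.content, "lxml")
   ...:     # The line defining `cont` is missing from the original; this assumes
   ...:     # the same title-link selector that the function version below uses.
   ...:     cont = soup.select("a.a-link-normal.s-access-detail-page.a-text-normal")
   ...:     for a in cont:
   ...:         print(a["title"])
   ...: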
Python Programming for the Absolute Beginner
Python: The Ultimate Beginner's Guide!
Automate the Boring Stuff with Python: Practical Programming for Total Beginners
Python: Learn Python in One Day and Learn It Well. Python for Beginners with Hands-on Project. (Learn Coding Fast with Hands-On Project Book 1)
Python Crash Course: A Hands-On, Project-Based Introduction to Programming
Learning Python
Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython
Python Cookbook
Python for Informatics: Exploring Information
Fluent Python
Python Playground: Geeky Projects for the Curious Programmer
Python in easy steps
Learn Python the Hard Way: A Very Simple Introduction to the Terrifyingly Beautiful World of Computers and Code (Zed Shaw's Hard Way)
Python: The Ultimate Beginners Guide: Start Coding Today
Programming the Raspberry Pi, Second Edition: Getting Started with Python
Data Science from Scratch: First Principles with Python
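As a quick check on the params used above, here is a minimal sketch, using only the standard library, of the URL that the params dict maps to. `build_search_url` is a hypothetical helper for illustration and not part of requests; requests does its own encoding, which should give an equivalent query string for simple string values like these.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_search_url(term, base="https://www.amazon.co.uk/s"):
    """Hypothetical helper: show the URL that the params dict maps to."""
    params = {"url": "search-alias=", "field-keywords": term}
    # urlencode percent-encodes reserved characters such as "=" in the values.
    return base + "?" + urlencode(params)


url = build_search_url("python")
print(url)
# Decoding the query string gives back the original values.
print(parse_qs(urlsplit(url).query))
```

Comparing this URL with the request shown in the Network tab is an easy way to confirm that the params match what the browser sends.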
To break the code into functions and get all the pages, we just need to keep looping until the anchor with the id pagnNextLink is no longer present:
from bs4 import BeautifulSoup
import requests
from urllib.parse import urljoin
# from urlparse import urljoin -> python 2


def parse(soup):
    # Yield the titles of all result links on one page as a single list.
    yield [a["title"] for a in soup.select("a.a-link-normal.s-access-detail-page.a-text-normal")]


def get(term):
    params = {"url": "search-alias=",
              "field-keywords": term}
    with requests.Session() as s:
        head = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36"}
        url = "https://www.amazon.co.uk/s"
        r = s.get(url, params=params, headers=head)
        soup = BeautifulSoup(r.content, "lxml")
        nxt = soup.select_one("#pagnNextLink")
        # Follow the "next" link until it is no longer on the page. Titles are
        # printed from the second page on; the first page is only used for its link.
        while nxt:
            cont = s.get(urljoin("https://www.amazon.co.uk/", nxt["href"]), headers=head)
            soup = BeautifulSoup(cont.content, "lxml")
            for t in parse(soup):
                print(t)
            nxt = soup.select_one("#pagnNextLink")
If we run the code for a couple of iterations:
In [5]: get("python")
['Python Machine Learning', 'Effective Python: 59 Specific Ways to Write Better Python (Effective Software Development)', 'Black Hat Python: Python Programming for Hackers and Pentesters', 'Doing Math with Python: Use Programming to Explore Algebra, Statistics, Calculus, and More!', 'Think Python: How to Think Like a Computer Scientist', 'Python Basics, Level 1 (Coding Club) (Coding Club, Level 1)', 'Python for Finance: Analyze Big Financial Data', 'Violent Python: A Cookbook for Hackers, Forensic Analysts, Penetration Testers and Security Engineers', "Python Essential Reference (Developer's Library)", 'Learn Web Scraping With Python In A Day: The Ultimate Crash Course to Learning the Basics of Web Scraping With Python In No Time (Python, Python ... Python Books, Python for Beginners)', 'Programming Python', 'QPython - Python on Android', 'Coding Club Python: Next Steps Level 2', "Python: Programming, Master's Handbook; A TRUE Beginner's Guide! Problem Solving, Code, Data Science, Data Structures & Algorithms (Code like a PRO ... engineering, r programming, iOS development)", 'Python: Complete Crash Course for Becoming an Expert in Python Programming', 'Coding Club Python: Building Big Apps Level 3']
['High Performance Python: Practical Performant Programming for Humans', '25ft Python No Spill Clean And Fill', 'Learning Python with Raspberry Pi', 'Web Scraping with Python: Collecting Data from the Modern Web', 'Invent Your Own Computer Games with Python, 3rd Edition', 'More Python Programming for the Absolute Beginner', 'Python for Kids: A Playful Introduction to Programming', "Monty Python's Life of Brian", 'Python 3 Object-oriented Programming - Second Edition', 'Introduction to Computation and Programming Using Python', 'Evolution of The Silly Walks T Shirt - Funny TV Ministry - Various Colours and Sizes XS - 3XL', "Hacking Secret Ciphers with Python: A beginner's guide to cryptography and computer programming with Python", 'Monty Python Fluxx', 'MASTER LOCK 8417DPRO Python Cable 1.80 m x 5 mm 2 Keys', "Learn Python: A beginner's guide book to programming python, learning the basics and start coding easily", 'Master Lock Python Disc Cylinder Key Adjustable Braided Steel Cable Lock, 10 x 1800 mm - Black']
In [6]: get("c programming")
['C Programming', 'C# 6.0 in a Nutshell: The Definitive Reference', 'PIC microcontrollers Programming in C with examples', 'C++: The Ultimate Crash Course to Learning the Basics of C++ In No Time (c plus plus, C++ for beginners, programming computer, how to program) (HTML, Javascript, ... Java, C++ Course, C++ Development Book 3)', 'Java: The Best Guide to Master Java Programming Fast (Java for Beginners, Java for Dummies, how to program, java app, java programming): Volume 2 (C Programming, HTML, Javascript)', 'A Book on C.: Programming in C.', "Learn C the Hard Way: Practical Exercises on the Computational Subjects You Keep Avoiding (Like C) (Zed Shaw's Hard Way Series)", 'C++: C++ and Hacking for dummies. A smart way to learn C plus plus and beginners guide to computer hacking: Volume 10 (C Programming, HTML, Javascript, Programming, Coding, CSS, Java, PHP)', 'Introduction to Algorithms', 'Programming: Computer Programming for Beginners: Learn the Basics of Java, SQL & C++ - 2. Edition (Coding, C Programming, Java Programming, SQL Programming, JavaScript, Python, PHP)', '21st Century C: C Tips from the New School', 'C For Dummies', 'Learn C# Programming Training DVD - Tutorial Video', 'GT01-C30R2-6P Programming PLC Cable 2.5M for Mitsubishi Melsec A970', 'Programming In C', 'Get Coding!: Learn HTML, CSS & JavaScript & build a website, app & game']
['Hewlett Packard [HP] Calculator Financial Platinum RPN Algebraic Programmable Ref HP12C PLATINUM', 'C: Easy C Programming for Beginners, Your Step-By-Step Guide To Learning C Programming (C Programming Series)', '4.9M RS232 DB9 F/M PLC Programming Cable Adapter White for Omron CQM1 C200HE HG', 'KOREAN COSMETICS, LG Household & Health Care_ SUM37, Secret Programming Eye C...', 'C++: C++ and Python. C++ for Beginners and Python for Dummies to Learn Fast (C Programming, Programming for beginners, c plus plus, programming ... Developers, Coding, CSS, Java, PHP)', '1:8 Brushless Combo BLC-150C Plus + Ripper 2000KV motor + programming Board', 'Lonely Planet Italian Phrasebook & Audio', 'Full Forgiveness - Let Go of Hurt & Offense With Guided Imagery, Self Hypnosis and Neuro-linguistic Programming (NLP)', 'Accelerated C++: Practical Programming by Example (C++ in Depth Series)', 'Gardena Water Computer C1060plus 1864-20', 'Learning To Build Apps For iPhone and iPad - Training DVD', 'Practical C Programming (A Nutshell handbook)', 'Prince Brat and the Whipping Boy', 'English: Practice Test Papers (Letts Key Stage 2 Success) (Letts Key Stage 1 Success)', 'Arabic For Dummies: Audio Set', 'The Actor and the Text (Applause Acting Series)']
You can do whatever you like in parse; I just pulled the titles so we can easily see that we are getting the correct data. I would also consider adding a sleep between requests.
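One way the sleep could be added is sketched below, under some assumptions: `crawl` and `FAKE_SITE` are hypothetical names for illustration, and `fetch` stands in for whatever performs the request and finds the next link (in the code above, that would be the s.get call plus soup.select_one("#pagnNextLink")). The fake site keeps the example runnable offline.

```python
import time


def crawl(first_url, fetch, delay=1.0, max_pages=None):
    """Follow "next" links, sleeping `delay` seconds between requests.

    `fetch(url)` is a hypothetical callable returning (titles, next_url),
    where next_url is None once there is no next-page link.
    """
    url, pages = first_url, 0
    while url is not None and (max_pages is None or pages < max_pages):
        if pages:
            time.sleep(delay)  # pause before every request after the first
        titles, url = fetch(url)
        pages += 1
        yield titles


# Offline stand-in for the real site: three linked fake pages.
FAKE_SITE = {
    "/s?page=1": (["Fluent Python"], "/s?page=2"),
    "/s?page=2": (["Python Cookbook"], "/s?page=3"),
    "/s?page=3": (["Learning Python"], None),
}

for titles in crawl("/s?page=1", FAKE_SITE.get, delay=0.1):
    print(titles)
```

Unlike get above, this sketch also yields the titles of the first page. The max_pages argument is an optional cap so a long result set does not keep the loop running indefinitely.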
You're a star, that is amazing man, thanks – entercaspa