user8134177

Using requests with a simple form in Python

I'm trying to scrape the example sentences for a specific French word using Python, but the page I get back doesn't seem to contain any results.

I inspected the search box and search button elements and included their names as parameters. Perhaps I'm missing something?

http://www.online-languages.info/french/examples.php

import requests
from bs4 import BeautifulSoup

word = 'manger'
url = 'http://www.online-languages.info/french/examples.php'
params = {'word': word, 'go': ''}

response = requests.post(url, data=params)
soup = BeautifulSoup(response.text, 'html5lib')
print(soup.prettify())

Here's what I'm looking to get:

Edit: Here is the output. It appears the page may be using JavaScript. If that's the case, is there a different library I could use?

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html dir="ltr" lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
 <head>
  <title>
   French example sentences :: Online-languages.info
  </title>
  <meta content="text/css" http-equiv="Content-Style-Type"/>
  <meta content="text/html; charset=utf-8" http-equiv="Content-Type"/>
  <meta content="Database containing thousands of example sentences. Sentences are important for learning correct use of words." name="Description"/>
  <meta content="French language. French grammar. French vocabulary. Tests. Language certificate. Verbs. French phrases. French pronunciation. E-learning. Conversation." name="Subject"/>
  <meta content="French, French grammar, French dictionary, French vocabulary, French language, tests, French test, exam, fce, verbs, exercise, certificate, course, games" name="keywords"/>
  <link href="../style.css" rel="stylesheet" type="text/css"/>
 </head>
 <body style="background-image:url(./img/bg2.jpg);">
  <div align="center">
   <table bgcolor="white" border="0" cellpadding="6" cellspacing="0" style="-moz-border-radius:20px;" width="1000">
    <tbody>
     <tr>
      <td align="center" colspan="4">
       <table border="0" cellspacing="0" width="100%">
        <tbody>
         <tr>
          <td align="center" width="180">
           <a href="../">
            <img alt="Online-languages.info" border="0" src="img/logo.png"/>
           </a>
          </td>
          <td align="left" style="background: url('img/bg.png'); -moz-border-radius:20px; padding: 20px 20px 20px 20px; ">
           <h1 style="color:#fff; font-size:20pt;">
            French words in example sentences
           </h1>
           <h3 style="color:#fff; font-size:8pt; font-weight:normal;">
            French language resources at
            <a href="http://www.online-languages.info" style="color:white;">
             Online-languages.info
            </a>
           </h3>
          </td>
         </tr>
        </tbody>
       </table>
      </td>
     </tr>
     <tr>
      <td align="left" valign="top" width="180">
       <table cellpadding="0" cellspacing="0" class="t2" width="180">
        <tbody>
         <tr>
          <td>
           <a class="arect" href="index.php">
            Home
           </a>
          </td>
         </tr>
         <tr>
          <td>
           <a class="arect" href="grammar.php">
            French grammar
           </a>
          </td>
         </tr>
         <tr>
          <td>
           <a class="arect" href="phrases.php">
            French phrases
           </a>
          </td>
         </tr>
         <tr>
          <td>
           <a class="arect" href="vocabulary.php">
            French vocabulary
           </a>
          </td>
         </tr>
         <tr>
          <td>
           <a class="arect" href="trainer.php">
            Vocabulary trainer
           </a>
          </td>
         </tr>
         <tr>
          <td>
           <a class="arect" href="picture-dictionary.php">
            Picture dictionary
           </a>
          </td>
         </tr>
         <tr>
          <td>
           <a class="arect" href="dictionary.php">
            French dictionary
           </a>
          </td>
         </tr>
         <tr>
          <td>
           <a class="arect" href="flashcards.php">
            Flashcards
           </a>
          </td>
         </tr>
         <tr>
          <td>
           <a class="arect" href="audio.php">
            Audio
           </a>
          </td>
         </tr>
         <tr>
          <td>
           <a class="arect" href="video.php">
            Video
           </a>
          </td>
         </tr>
         <tr>
          <td>
           <a class="arect" href="translator.php">
            French translator
           </a>
          </td>
         </tr>
         <tr>
          <td>
           <a class="arect" href="tests.php">
            French quizzes
           </a>
          </td>
         </tr>
         <tr>
          <td>
           <a class="arect" href="examples.php">
            Examples of use
           </a>
          </td>
         </tr>
         <tr>
          <td>
           <a class="arect" href="pronunciation.php">
            French pronunciation
           </a>
          </td>
         </tr>
         <tr>
          <td>
           <a class="arect" href="news.php">
            News in French
           </a>
          </td>
         </tr>
         <tr>
          <td>
           <a class="arect" href="applications.php">
            Language software
           </a>
          </td>
         </tr>
         <tr>
          <td>
           <a class="arect" href="mobile.php">
            Mobile phones
           </a>
          </td>
         </tr>
        </tbody>
       </table>
       <img alt="" border="0" height="0" src="http://whos.amung.us/swidget/fnhahzdo0ncz.gif" style="display:none;" width="0"/>
      </td>
      <td align="left" bgcolor="#ffffff" valign="top" width="90%">
       <script type="text/javascript">
        <!--
google_ad_client = "ca-pub-7058441231119392";
/* online-languages */
google_ad_slot = "3704078504";
google_ad_width = 728;
google_ad_height = 90;
//-->
       </script>
       <script src="http://pagead2.googlesyndication.com/pagead/show_ads.js" type="text/javascript">
       </script>
       <br/>
       <br/>
       <div align="justify">
        <div id="content">
         <iframe frameborder="0" height="650" src="http://www.dicts.info/examples.php?lang=French&amp;disa=1" width="95%">
         </iframe>
        </div>
       </div>
       <!-- cookieconsent2 by Silktide -->
       <script type="text/javascript">
        window.cookieconsent_options = {
learnMore: 'More info',
message: 'This website uses cookies to personalize content and to improve your experience on our website.',
link: 'https://www.google.com/policies/technologies/cookies/',
theme: 'light-bottom'
};
       </script>
       <script src="https://s3.amazonaws.com/cc.silktide.com/cookieconsent.latest.min.js" type="text/javascript">
       </script>
       <noscript>
        &lt;p&gt;We recommend you enable JavaScript to take full advantage of this website.&lt;/p&gt;
       </noscript>
      </td>
     </tr>
    </tbody>
   </table>
   <br/>
   <table width="700">
    <tbody>
     <tr>
      <td align="center">
       <a href="../english">
        <img alt="" border="0" height="60" src="http://fimg.seznam.cz/?spec=ft100x75&amp;url=http://www.jazyky-online.info/anglictina"/>
        <br/>
        English
       </a>
      </td>
      <td align="center">
       <a href="../german">
        <img alt="" border="0" height="60" src="http://fimg.seznam.cz/?spec=ft100x75&amp;url=http://www.jazyky-online.info/spanelstina"/>
        <br/>
        German
       </a>
      </td>
      <td align="center">
       <a href="../french">
        <img alt="" border="0" height="60" src="http://fimg.seznam.cz/?spec=ft100x75&amp;url=http://www.jazyky-online.info/francouzstina"/>
        <br/>
        French
       </a>
      </td>
      <td align="center">
       <a href="../spanish">
        <img alt="" border="0" height="60" src="http://fimg.seznam.cz/?spec=ft100x75&amp;url=http://www.jazyky-online.info/spanelstina"/>
        <br/>
        Spanish
       </a>
      </td>
      <td align="center">
       <a href="../russian">
        <img alt="" border="0" height="60" src="http://fimg.seznam.cz/?spec=ft100x75&amp;url=http://www.jazyky-online.info/rustina"/>
        <br/>
        Russian
       </a>
      </td>
      <td align="center">
       <a href="../chinese">
        <img alt="" border="0" height="60" src="http://fimg.seznam.cz/?spec=ft100x75&amp;url=http://www.jazyky-online.info/cinstina"/>
        <br/>
        Chinese
       </a>
      </td>
     </tr>
    </tbody>
   </table>
   <br/>
   <br/>
   <table cellpadding="10" style="background:url(img/bgfoot.jpg);" width="100%">
    <tbody>
     <tr>
      <td align="center">
       <font color="#0000aa">
        <a href="../licence.html">
         Licence
        </a>
        |
        <a href="../licence.html">
         Terms of use
        </a>
        |
        <a href="../licence.html#disclaimer">
         Disclaimer
        </a>
        |
        <a href="../licence.html#privacy">
         Privacy policy
        </a>
        |
        <a href="http://www.dicts.info/contact.php?s=Online-languages">
         Contact
        </a>
       </font>
       <br/>
       Copyright © 2007-2017, Online-languages.info
      </td>
     </tr>
    </tbody>
   </table>
  </div>
  <script type="text/javascript">
   var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
  </script>
  <script type="text/javascript">
   try {
var pageTracker = _gat._getTracker("UA-8795372-1");
pageTracker._trackPageview();
} catch(err) {}
  </script>
 </body>
</html>

Upvotes: 1

Views: 138

Answers (1)

Nick

Reputation: 7441

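This works for me. Note that I used the GET method and the URI referenced by the actual form on that page. That form appears to be served from www.dicts.info, the page embedded in the iframe in your output, rather than from online-languages.info.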

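import requests

word = 'manger'
# Target the dicts.info page directly instead of the outer wrapper page.
url = 'http://www.dicts.info/examples.php'
# The page appears to check for a Referer header, so send one.
headers = {'Referer': 'http://www.dicts.info/examples.php?disa=1&lang2=french&word=bon&go=Search'}
params = {'word': word, 'disa': '1', 'lang2': 'french'}

# GET sends params in the query string, as the form on that page does.
response = requests.get(url, params=params, headers=headers)
print(response.text)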
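
If you want to find that inner URL programmatically rather than by inspecting the page, here is a minimal sketch that pulls the iframe src out of HTML like the output in the question. It uses only the standard library so it runs without network access; IframeFinder and find_iframe_sources are hypothetical helper names, not part of requests or BeautifulSoup, and the sample string is a fragment copied from the question's output.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IframeFinder(HTMLParser):
    """Collect the src attribute of every <iframe> tag encountered."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_iframe_sources(html, base_url):
    """Return absolute URLs for all iframes found in the given HTML."""
    finder = IframeFinder()
    finder.feed(html)
    # urljoin leaves absolute URLs alone and resolves relative ones.
    return [urljoin(base_url, src) for src in finder.sources]


# Fragment copied from the output posted in the question.
sample = '''
<div id="content">
 <iframe frameborder="0" height="650" src="http://www.dicts.info/examples.php?lang=French&amp;disa=1" width="95%">
 </iframe>
</div>
'''

outer_url = "http://www.online-languages.info/french/examples.php"
print(find_iframe_sources(sample, outer_url))
```

In a real script you would pass response.text from the outer page instead of the sample string, then request the URL it returns.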

UPDATE

It appears the PHP page checks that an appropriate Referer header is sent with the request, so add one, as I did above (I edited the original code).
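
To check what is actually being sent, here is a rough sketch that builds the same GET request with and without the Referer header, without sending anything. It uses the standard library's urllib rather than requests so it can run offline; build_request is a hypothetical helper, and the Referer value is simply the one from the code above. Whether the server really rejects requests without it is only what I observed, not something this snippet proves.

```python
from urllib.parse import urlencode
from urllib.request import Request

URL = "http://www.dicts.info/examples.php"
# Referer value taken from the code above.
REFERER = "http://www.dicts.info/examples.php?disa=1&lang2=french&word=bon&go=Search"


def build_request(word, with_referer=True):
    """Build (but do not send) the GET request for one word."""
    # urlencode percent-encodes the values, including accented characters.
    query = urlencode({"word": word, "disa": "1", "lang2": "french"})
    headers = {"Referer": REFERER} if with_referer else {}
    return Request(f"{URL}?{query}", headers=headers)


with_header = build_request("manger")
without_header = build_request("manger", with_referer=False)

print(with_header.get_method(), with_header.full_url)
print("Referer sent:", with_header.has_header("Referer"))
print("Referer sent:", without_header.has_header("Referer"))
```

Passing each object to urllib.request.urlopen and comparing the two responses would be one way to confirm the Referer check for yourself.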

Upvotes: 1
