Federico Rubbi

Reputation: 734

Python mechanicalsoup get web form

I am writing a standalone bot that logs into JitJat, an anonymous instant messaging site, and sends a message to a user. Logging in works, and I reach the index page where I select my recipient.

Finally, I reach the chat page with the recipient, but whenever I try to get the form on that page I get None. I have tried this many times, using robobrowser, requests and mechanicalsoup.

Here is my script:

import mechanicalsoup

#Current url: https://jitjat.org/login.php
browser = mechanicalsoup.StatefulBrowser()
browser.open('https://jitjat.org/login.php')
print('Current url: ' + str(browser.get_url()))
browser.select_form()
browser['username'] = 'my username'
browser['password'] = 'my password'
browser.submit_selected()
print('Redirect: ' + str(browser.get_url()))

#Current url: https://jitjat.org/index.php
browser.follow_link("recipient username")
print('Redirect: ' + str(browser.get_url()))

#Current url: https://jitjat.org/chat.php?id=(recipient username)
browser.select_form()
print(browser.get_current_form().print_summary())

Here is the page source:

<!DOCTYPE html>
<html>

<head>
	<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

	<title>JitJat - anonymous instant messaging</title>
    
    <link href="css/jChat.css" rel="stylesheet" media="screen" type="text/css" />
	
    <link href="css/reset.css" rel="stylesheet" media="screen" type="text/css" />
    <link href="css/bootstrap.css" rel="stylesheet" media="screen" type="text/css" />
    <link href="css/bootstrap-responsive.css" rel="stylesheet" media="screen" type="text/css" />
    <link href="css/user_css.css" rel="stylesheet" media="screen" type="text/css" />
   
    <script src="js/jquery.js" type="text/javascript"></script>
    <script src="js/bootstrap.js" type="text/javascript"></script>
    <script src="js/jChat.js" type="text/javascript"></script>
    <script src="js/jquery.nicescroll.js" type="text/javascript"></script>
    <script src="js/custom.js" type="text/javascript"></script> 
    
</head>

<body>
	
    <div class="navbar navbar-fixed-top">
      <div class="navbar-inner">
        <div class="container">
        
           <div id="header">  
                <div id="logo">
                    <a href="index.php"><h1>logo</h1></a>
                </div>
    
                <div id="info">
                    <ul id="userBox">
                        <li class="dropdown">
                            <a class="dropdown-toggle" data-toggle="dropdown" href="#">ni47gv2x9ne<b class="caret"></b></a>
                            <ul class="dropdown-menu">
                                <li><a href="?action=logout&token=20419ec4554207718f71e9c5255f3514"><span class="icon-off"></span> <strong>Logout</strong></a></li>
                            </ul>
                      	</li>
                    </ul>
                </div>
                <div class="clear"></div>
            </div>

        </div>
      </div>
    </div>
    
    <div class="container">
        <ul class="breadcrumb">
            <li><a href="index.php">Home</a><span class="divider">&raquo;</span></li>
            <li class="active">Chat</li>
        </ul>
    </div>
    



    
        <div class="container">
        <ul class="breadcrumb" align="center">
            <li><noscript>
<span style="font-weight: normal; color: #ff0000">You need to have scripting enabled to use Chat.</span>
</noscript>

<script type="text/javascript">
document.write('You are using <span style="font-weight: bold">Chat</span>.');
</script> Click to enter <a href="/chat.php?id=Bluebear&mode=messaging"><span style="font-weight: normal">Messaging</span></a> <img src="images/ui/question.png" width="13" height="13" style="vertical-align: -1px;" title="differences: click-to-send, manual refresh, manual scroll, multiple lines possible, delete single messages, see if message is unread"></li>

        </ul>
    </div>
    
    <div class="container">
    
	<!-- BOX -->
    <div class="box">
    
    	<div class="header">
    	
    	   <div style="z-index: 10; position: absolute; left: 88%; margin-top: 8px; width: 135px; height: 25px; border: 0px solid black;"><a href="/chat.php?id=Bluebear&nuke=1&mode=" title="delete all messages permanently">nuke conversation</a></div>

        <h4><img src="images/avatars/user1.png" width="14" height="14" style="vertical-align: -2px;"> <a href="/chat.php?id=bluebear&mode=">bluebear</a></h4>

        
        </div>

        
        
    
                <div class="content">
			<!-- jChat -->
            <ul class="messages-layout">
                <li class="messages"></li>
            </ul>
            <!-- Enter message field -->
             <span class="session_time">Online</span><span id="sample"></span>
            <div class="message-entry">
                <input type="text" id="text-input-field" class="input-sm" name="message-entry" /> 
                <div class="send-group">
                    <a href="#emoticons" data-option="emotions" class="attachEmotions" data-toggle="modal"></a>
                    <input type="submit" name="send-message" id="sendMessage" class="btn btn-primary" value="Send" />
                </div>
            </div>
            
            <!-- Emoticons Modal -->
            <div id="emoticons" class="modal hide fade">
                <div class="modal-header"><h4>Emotions</h4></div>
                <div class="modal-body"></div>
                <div class="modal-footer">
                    <a href="#" class="btn" data-dismiss="modal">Close</a>
                </div>
            </div>
            
            <!-- // jChat -->
                     
        </div>
        
    </div>
    <!-- // END BOX -->
    
    
    
    
     
    </div>
                  
</body> 
   
   <script>
   		$().Chat();
   </script>
   
</html>

How can I fix this? Any ideas on how to solve it would be appreciated. Thank you for your attention!

Upvotes: 0

Views: 1029

Answers (1)

Federico Rubbi

Reputation: 734

I solved the issue. For anyone who runs into a similar problem caused by unusual HTML, I recommend robobrowser, a smart cross-platform module (its documentation is useful).

If for some reason you cannot get the form on the page:

  1. Try to specify the id of the form, like:

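    # browser is a RoboBrowser instance that has already opened the page;
    # the id is passed to get_form(), not to the RoboBrowser constructor
    form = browser.get_form(id='FormId')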

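  2. Try passing headers with the requests module:

    import requests
    from robobrowser import RoboBrowser

    mysession = requests.session()
    response = mysession.get('website_url')
    print(response.headers)

    Add headers like so:

    mysession.headers = response.headers
    browser = RoboBrowser(session=mysession, history=True)
    browser.open('website_url')
    # look the form up by its id once the page is loaded
    form = browser.get_form(id='FormId')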
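One thing worth checking before trying any of this: the page source quoted in the question does not seem to contain a <form> element at all (the text input and the Send button sit inside plain div elements, presumably wired up by jChat.js), which may be why every library returned None. The sketch below uses only the standard library to count the tags in a page. The names TagCounter, count_tags and sample_page are made up for illustration, and sample_page is a trimmed copy of the message-entry block from the question, not the live site.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count how often each start tag occurs in an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <input ... /> also end up here,
        # because the default handle_startendtag calls handle_starttag.
        self.counts[tag] = self.counts.get(tag, 0) + 1


def count_tags(html):
    """Return a dict mapping tag name to its number of occurrences."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


# Trimmed-down stand-in for the chat page quoted in the question.
sample_page = """
<div class="message-entry">
    <input type="text" id="text-input-field" class="input-sm" name="message-entry" />
    <div class="send-group">
        <input type="submit" name="send-message" id="sendMessage" class="btn btn-primary" value="Send" />
    </div>
</div>
"""

counts = count_tags(sample_page)
print("form tags: ", counts.get("form", 0))
print("input tags:", counts.get("input", 0))
```

If the form count is zero on the real page, no form-selection call can succeed, and the message would probably have to be sent the way the page's JavaScript sends it. I have not verified how JitJat does that.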

Hope it helps!

Upvotes: 1
