rajan sthapit
rajan sthapit

Reputation: 4354

Socket receive slower in python

I am doing some profiling of my python code. I am using async api for rpc calls

async def rpc_call():
    return client.call(params) // makes rpc calls over tcp

async def test():
    await rpc_call()


start = time.time()
loop = asyncio.get_event_loop()
tasks = []

for i in range(10000):
    tasks.append(asyncio.ensure_future(test()))

print("Starting the loop")
loop.run_until_complete(asyncio.wait(tasks))
loop.close()
end = time.time()
print("Total time: {}".format(end - start))

It took almost like 25 seconds to finish it which is really slow

I took the cProfile dump of the process

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10001   13.362    0.001   13.362    0.001 {method 'recv_into' of '_socket.socket' objects}
    10004    0.760    0.000    0.802    0.000 {built-in method _socket.getaddrinfo}
1477355/1477208    0.404    0.000    0.812    0.000 {built-in method builtins.isinstance}
    70008    0.334    0.000    0.837    0.000 /usr/lib/python3.6/_collections_abc.py:824(update)
    20004    0.326    0.000    0.326    0.000 {method 'sendto' of '_socket.socket' objects}
   310031    0.313    0.000    0.373    0.000 /usr/lib/python3.6/urllib/parse.py:109(_coerce_args)
    20002    0.296    0.000    0.296    0.000 {method 'sendall' of '_socket.socket' objects}
   100010    0.272    0.000    0.828    0.000 /usr/lib/python3.6/urllib/parse.py:359(urlparse)
   140014    0.252    0.000    0.460    0.000 /usr/lib/python3.6/urllib/parse.py:392(urlsplit)
  1963120    0.250    0.000    0.250    0.000 {method 'lower' of 'str' objects}
   250161    0.236    0.000    0.408    0.000 /usr/lib/python3.6/abc.py:178(__instancecheck__)
    70007    0.222    0.000    0.427    0.000 /usr/lib/python3.6/http/client.py:1195(putheader)
   270027    0.221    0.000    0.762    0.000 /usr/lib/python3.6/_collections_abc.py:742(__iter__)
    10001    0.185    0.000   23.543    0.002 /home/me/venv2/lib/python3.6/site-packages/requests/sessions.py:589(send)
394556/394200    0.185    0.000    0.198    0.000 {built-in method builtins.hasattr}
    10001    0.175    0.000   17.859    0.002 /home/me/venv2/lib/python3.6/site-packages/urllib3/connectionpool.py:322(_make_request)
   330445    0.171    0.000    0.171    0.000 /usr/lib/python3.6/_weakrefset.py:70(__contains__)
   280028    0.168    0.000    0.210    0.000 /home/me/venv2/lib/python3.6/site-packages/requests/structures.py:53(__getitem__)
   290033    0.164    0.000    0.202    0.000 /home/me/venv2/lib/python3.6/site-packages/requests/structures.py:48(__setitem__)
    70007    0.162    0.000    1.329    0.000 /home/me/venv2/lib/python3.6/site-packages/requests/sessions.py:50(merge_setting)
    10001    0.159    0.000   19.823    0.002 /home/me/venv2/lib/python3.6/site-packages/urllib3/connectionpool.py:447(urlopen)
    90011    0.157    0.000   13.575    0.000 {method 'readline' of '_io.BufferedReader' objects}
    10001    0.153    0.000    0.386    0.000 /usr/lib/python3.6/email/feedparser.py:471(_parse_headers)
    20002    0.152    0.000    1.032    0.000 /usr/lib/python3.6/email/feedparser.py:218(_parsegen)
   100010    0.148    0.000    0.309    0.000 /usr/lib/python3.6/email/message.py:462(get)
   380304    0.145    0.000    0.145    0.000 {method 'encode' of 'str' objects}
   100023    0.144    0.000    0.413    0.000 /usr/lib/python3.6/_collections_abc.py:657(get)
    70007    0.143    0.000    0.481    0.000 /home/me/venv2/lib/python3.6/site-packages/requests/utils.py:244(to_key_val_list)
    10001    0.142    0.000    1.425    0.000 /usr/lib/python3.6/http/client.py:1241(_send_request)
    10001    0.142    0.000    1.845    0.000 /usr/lib/python3.6/http/client.py:194(parse_headers)
   150186    0.127    0.000    0.127    0.000 {method 'match' of '_sre.SRE_Pattern' objects}
    10001    0.123    0.000    0.159    0.000 /home/me/venv2/lib/python3.6/site-packages/urllib3/poolmanager.py:57(_default_key_normalizer)
    10001    0.122    0.000    0.190    0.000 /usr/lib/python3.6/socket.py:218(makefile)
    10001    0.121    0.000    0.121    0.000 {method 'SerializeToString' of 'google.protobuf.pyext._message.CMessage' objects}
    10001    0.117    0.000    0.524    0.000 /home/me/venv2/lib/python3.6/site-packages/requests/models.py:433(prepare_headers)
    10001    0.115    0.000   22.245    0.002 /home/me/venv2/lib/python3.6/site-packages/requests/adapters.py:388(send)
    10001    0.112    0.000   15.728    0.002 /usr/lib/python3.6/http/client.py:290(begin)
    20002    0.112    0.000    0.297    0.000 /home/me/venv2/lib/python3.6/site-packages/urllib3/util/url.py:132(parse_url)
    10001    0.108    0.000    3.557    0.000 /home/me/venv2/lib/python3.6/site-packages/requests/sessions.py:401(prepare_request)
    10001    0.104    0.000   31.352    0.003 /home/me/venv2/lib/python3.6/site-packages/pb/events/ingestor/ingestor_client.py:23(ingest)
   200986    0.104    0.000    0.104    0.000 {method 'split' of 'str' objects}
    10001    0.102    0.000   27.488    0.003 /home/me/venv2/lib/python3.6/site-packages/requests/sessions.py:441(request)
    70007    0.102    0.000    0.215    0.000 {method 'values' of 'collections.OrderedDict' objects}
   110011    0.101    0.000    0.129    0.000 /home/me/venv2/lib/python3.6/site-packages/urllib3/_collections.py:150(__getitem__)
    10001    0.100    0.000   13.613    0.001 /usr/lib/python3.6/http/client.py:257(_read_status)
    50006    0.098    0.000    0.794    0.000 /home/me/venv2/lib/python3.6/site-packages/requests/structures.py:42(__init__)
   230850    0.095    0.000    0.095    0.000 {method 'get' of 'dict' objects}
    70007    0.091    0.000    0.167    0.000 /usr/lib/python3.6/email/_policybase.py:293(header_source_parse)
    10001    0.089    0.000    0.599    0.000 /home/me/venv2/lib/python3.6/site-packages/requests/models.py:347(prepare_url)
    40018    0.087    0.000    0.148    0.000 /usr/lib/python3.6/os.py:664(__getitem__)
   350035    0.084    0.000    0.084    0.000 /home/me/venv2/lib/python3.6/site-packages/requests/structures.py:60(<genexpr>)
    10001    0.083    0.000    0.150    0.000 /usr/lib/python3.6/http/client.py:1071(putrequest)
    20002    0.083    0.000    0.268    0.000 /usr/lib/python3.6/http/cookiejar.py:1657(extract_cookies)
    10001    0.081    0.000    0.807    0.000 /home/me/venv2/lib/python3.6/site-packages/urllib3/response.py:441(from_httplib)
    50005    0.081    0.000    0.331    0.000 /home/me/venv2/lib/python3.6/site-packages/requests/structures.py:59(__iter__)
454337/443751    0.080    0.000    0.085    0.000 {built-in method builtins.len}
    70007    0.080    0.000    0.126    0.000 /usr/lib/python3.6/urllib/parse.py:188(_hostinfo)
    10001    0.079    0.000    0.231    0.000 /home/me/venv2/lib/python3.6/site-packages/urllib3/response.py:113(__init__)
    20002    0.077    0.000    0.388    0.000 /home/me/venv2/lib/python3.6/site-packages/requests/utils.py:698(select_proxy)
    10001    0.074    0.000    1.068    0.000 /home/me/venv2/lib/python3.6/site-packages/requests/adapters.py:253(build_response)
    40004    0.073    0.000    0.107    0.000 /usr/lib/python3.6/email/message.py:497(get_all)
   120012    0.072    0.000    0.161    0.000 /usr/lib/python3.6/email/_policybase.py:281(_sanitize_header)
   176461    0.071    0.000    0.071    0.000 {built-in method builtins.getattr}
    10001    0.069    0.000    0.351    0.000 /usr/lib/python3.6/urllib/parse.py:478(urljoin)
   100010    0.069    0.000    0.077    0.000 /usr/lib/python3.6/email/feedparser.py:78(readline)
    40004    0.068    0.000    0.287    0.000 /usr/lib/python3.6/email/message.py:564(get_content_type)
    10001    0.067    0.000    0.577    0.000 /home/me/venv2/lib/python3.6/site-packages/urllib3/response.py:346(read)
   141040    0.066    0.000    0.066    0.000 {built-in method __new__ of type object at 0x7f374cc230a0}
     9980    0.065    0.000    0.187    0.000 /usr/lib/python3.6/socket.py:155(__repr__)
    30004    0.065    0.000    0.134    0.000 /usr/lib/python3.6/http/cookiejar.py:1239(__init__)
    10000    0.065    0.000    0.084    0.000 /home/me/venv2/lib/python3.6/site-packages/urllib3/util/selectors.py:392(__init__)
    10000    0.063    0.000    0.453    0.000 /home/me/venv2/lib/python3.6/site-packages/urllib3/util/wait.py:9(_wait_for_io_events)
    10001    0.062    0.000    0.275    0.000 /home/me/venv2/lib/python3.6/site-packages/requests/models.py:588(__init__)
    10001    0.062    0.000    0.439    0.000 /usr/lib/python3.6/http/client.py:1017(_send_output)
    30003    0.061    0.000    0.109    0.000 /usr/lib/python3.6/urllib/parse.py:459(urlunsplit)
   412748    0.060    0.000    0.060    0.000 {method 'append' of 'list' objects}
    70007    0.060    0.000    0.060    0.000 {method 'fullmatch' of '_sre.SRE_Pattern' objects}
    20003    0.059    0.000    0.070    0.000 /home/me/venv2/lib/python3.6/site-packages/requests/hooks.py:17(default_hooks)
    69992    0.059    0.000    0.059    0.000 /usr/lib/python3.6/enum.py:515(__new__)
    70007    0.058    0.000    0.194    0.000 /usr/lib/python3.6/urllib/parse.py:154(hostname)
    10001    0.057    0.000    0.160    0.000 /usr/lib/python3.6/http/client.py:470(readinto)
    50006    0.056    0.000    0.110    0.000 /usr/lib/python3.6/http/cookiejar.py:1200(vals_sorted_by_key)
    10001    0.056    0.000   16.012    0.002 /usr/lib/python3.6/http/client.py:1287(getresponse)
    20002    0.055    0.000    0.575    0.000 /home/me/venv2/lib/python3.6/site-packages/requests/cookies.py:119(extract_cookies_to_jar)
   100010    0.055    0.000    0.132    0.000 /usr/lib/python3.6/email/feedparser.py:128(__next__)
    59976    0.055    0.000    0.155    0.000 /usr/lib/python3.6/socket.py:97(_intenum_converter)
    30003    0.055    0.000    0.357    0.000 /home/me/venv2/lib/python3.6/site-packages/requests/cookies.py:38(__init__)
    10003    0.054    0.000    1.079    0.000 /home/me/venv2/lib/python3.6/site-packages/statsd/client.py:141(__init__)
    20002    0.054    0.000    0.088    0.000 /home/me/venv2/lib/python3.6/site-packages/urllib3/util/url.py:22(__new__)
    10001    0.054    0.000    0.176    0.000 /usr/lib/python3.6/http/cookiejar.py:1334(add_cookie_header)
    10001    0.054    0.000    0.118    0.000 /usr/lib/python3.6/email/feedparser.py:139(__init__)
    30003    0.053    0.000    0.228    0.000 /usr/lib/python3.6/urllib/parse.py:448(urlunparse)
    20003    0.053    0.000    0.260    0.000 /home/me/venv2/lib/python3.6/site-packages/requests/cookies.py:503(cookiejar_from_dict)
    10004    0.053    0.000    0.976    0.000 /usr/lib/python3.6/socket.py:726(getaddrinfo)

So, I could see that majority of the time is spent on recv_into. Is it possible to know what is causing it to be so slow. I saw some threads about setting TCP_NODELAY to 1. I am using a library which is internally using the requests library http://docs.python-requests.org/en/master/.

Is it possible to know what is causing it to be this slow?

It looks like somehow the process is not able to make many connections at a time

python    22931 user    4u  IPv4 118136448      0t0  UDP *:36394 
python    22931 user    5u  IPv4 118136451      0t0  UDP *:60695 
python    22931 user    6u  IPv4 118157754      0t0  UDP *:36474 
python    22931 user   11u  IPv4 118157757      0t0  UDP *:50132 
python    22931 user   12u  IPv4 118157760      0t0  UDP *:43872 
python    22931 user   13u  IPv4 118157763      0t0  UDP *:47969 
python    22931 user   14u  IPv4 118157767      0t0  UDP *:47672 
python    22931 user   15u  IPv4 118157770      0t0  UDP *:36625 
python    22931 user   16u  IPv4 118157773      0t0  UDP *:40394 
python    22931 user   17u  IPv4 118157776      0t0  UDP *:46266 
python    22931 user   18u  IPv4 118157779      0t0  UDP *:46401 
python    22931 user   19u  IPv4 118157782      0t0  UDP *:37596 
python    22931 user   20u  IPv4 118157785      0t0  UDP *:36725 
python    22931 user   21u  IPv4 118157788      0t0  UDP *:55619 
python    22931 user   22u  IPv4 118157791      0t0  UDP *:53611 
python    22931 user   23u  IPv4 118157794      0t0  UDP *:34542 
python    22931 user   24u  IPv4 118157797      0t0  UDP *:43504 
python    22931 user   25u  IPv4 118157800      0t0  UDP *:34200 
python    22931 user   26u  IPv4 118157803      0t0  UDP *:55265 
python    22931 user   27u  IPv4 118157806      0t0  UDP *:50978 
python    22931 user   28u  IPv4 118157809      0t0  UDP *:58085 
python    22931 user   29u  IPv4 118156399      0t0  UDP *:46931 
python    22931 user   30u  IPv4 118156402      0t0  UDP *:40046 
python    22931 user   31u  IPv4 118156405      0t0  UDP *:57314 
python    22931 user   32u  IPv4 118156408      0t0  UDP *:34873 
python    22931 user   33u  IPv4 118156411      0t0  UDP *:55831 
python    22931 user   34u  IPv4 118154474      0t0  UDP *:60621 
python    22931 user   35u  IPv4 118154477      0t0  UDP *:59782 
python    22931 user   36u  IPv4 118154480      0t0  UDP *:54292 
python    22931 user   42u  IPv4 118148540      0t0  UDP *:37970 
python    22931 user   43u  IPv4 118150650      0t0  UDP *:55516 
python    22931 user   44u  IPv4 118151731      0t0  UDP *:59302 
python    22931 user   45u  IPv4 118152795      0t0  UDP *:58685 
python    22931 user   46u  IPv4 118145722      0t0  UDP *:34811 
python    22931 user   47u  IPv4 118154319      0t0  UDP *:43855 
python    22931 user   48u  IPv4 118146153      0t0  UDP *:38818 
python    22931 user   49u  IPv4 118146009      0t0  UDP *:36514 
python    22931 user   51u  IPv4 118157378      0t0  UDP *:56762 
python    22931 user   52u  IPv4 118157474      0t0  UDP *:39237 
python    22931 user   53u  IPv4 118154471      0t0  UDP *:44001 
python    22931 user   54u  IPv4 118146930      0t0  UDP *:34513 
python    22931 user   55u  IPv4 118147734      0t0  UDP *:50486 
python    22931 user   56u  IPv4 118157676      0t0  UDP *:60790 
python    22931 user   57u  IPv4 118157751      0t0  UDP *:3740

I was assuming that because of the async nature, there will be lots of connections at the same time waiting for the result. But here there are not many. I am also using connection pooling of size 2000. So shouldn't this have worked?

Upvotes: 1

Views: 1227

Answers (1)

user4815162342
user4815162342

Reputation: 154876

It seems that you expect your RPC calls to be executed created in parallel, but that's not what the code does.

The rpc_call coroutine calls client.call() without awaiting it, presumably because call is a function from a non-async library. This is blocking the event loop, i.e. it prevents other queued tasks from running. Even though the code looks like it should execute the calls in parallel, it is in reality sequential, equivalent to:

for i in range(10000):
    client.call(params)

recv_into dominating the profile simply means that the majority of time was spent waiting for responses. This is quite expected in sequential network code and means that the bottleneck is not CPU, it's either that the server is too slow or underutilized. Given your remark about the process not being able to make connections at the same time, it's the latter.

To benefit from asyncio, you should use an RPC library that actually uses asyncio under the hood. With such a library, client.call() would be an awaitable coroutine. The difference is that when such a coroutine blocks, it yields control to the event loop, allowing other tasks to make progress. The connections would be made in parallel and you would definitely see a difference.

Upvotes: 2

Related Questions