Reputation: 813
I use Yaws
as a reverse proxy. Yaws lives in a docker container named yaws
which calls an NGINX
container named wsm
. Very rarely, I get a 500 Internal Server Error when called from Opera
or Edge
but not from within Yaws.
How to debug this?
The same is true when called in Linux with curl
: we get a HTTP/1.1 500 Internal Server Error after due time. Of course, Yaws is involved here, too.
CS-1 16:33:06 :/tmp# curl -v 'https://wp:[email protected]/equivoxx/2181/sitemap/?rrr=1&del=1&ren=1&rss=1&bak=1&x=1&sfp=1&lg=en&del_sm=1'
* Trying 123.456.240.136:443...
* TCP_NODELAY set
* Connected to sm.myurl.biz (123.456.240.136) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
[...]
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server did not agree to a protocol
* Server certificate:
* subject: CN=myurl.biz
* start date: Jun 15 00:00:00 2023 GMT
* expire date: Sep 13 23:59:59 2023 GMT
* subjectAltName: host "sm.myurl.biz" matched cert's "sm.myurl.biz"
* issuer: C=AT; O=ZeroSSL; CN=ZeroSSL RSA Domain Secure Site CA
* SSL certificate verify ok.
* Server auth using Basic with user 'wp'
> GET /equivoxx/2181/sitemap/?rrr=1&del=1&ren=1&rss=1&bak=1&x=1&sfp=1&lg=en&del_sm=1 HTTP/1.1
> Host: sm.myurl.biz
> Authorization: Basic d3A6Y2hlZiw0OA==
> User-Agent: curl/7.68.0
> Accept: */*
>
[working]
* Mark bundle as not supporting multiuse
< HTTP/1.1 500 Internal Server Error
< Connection: close
< Server: Yaws 2.0.9
< Date: Wed, 26 Jul 2023 14:34:28 GMT
< Content-Length: 47
< Content-Type: text/html
< Vary: Accept-Encoding
<
* Closing connection 0
* TLSv1.3 (OUT), TLS alert, close notify (256):
<html><h1>500 Internal Server Error</h1></html>
The message Mark bundle as not supporting multiuse
is not an error message, but a debug trace message of no consequence and does not relate to HTTP/1.1 500 Internal Server Error
(Mark bundle as not supporting multiuse HTTP/1.1 500 Internal Server Error)
E.g., the URL
https://sm.myurl.biz/equimyurl/2181/sitemap/
runs fine everywhere whereas this URL with a couple of parameters
https://sm.myurl.biz/equimyurl/2181/sitemap/?rrr=1&del=1&ren=1&rss=1&bak=1&x=1&sfp=1&lg=en&del_sm=1
delivers the error in the browser or curl. Due to the parameters, this URL does some database processing.
Inspecting the parameters, it turns out that only del_sm
triggers the 500 error. It deletes the old data by dropping some tables which then are recreated and thus renewed. This is none of Yaws business. Nothing to worry about, right?
wsm
Running the URL with the same parameters on the NGINX container wsm
from within the Yaws container yaws
(so curl
calls wsm) however is OK:
curl -v 'wsm/equivoxx/2181/sitemap?rrr=1&del=1&ren=1&rss=1&bak=1&x=1&sfp=1&lg=en&del_sm=1' | tee /tmp/2181-wsm1.htm
The output of curl is as it should be. Calling the output 2181-wsm1.htm
with a browser is flawless. The data is updated fine. Yaws should not alter the output. Nevertheless I get the 500 error.
I have set up a log file /tmp/tmp_'.$this->id_ex.'__hunt_500.log
recording messages to track the positions in the source code where an error might occur. No surprise. The last entry looks like
2023-07-24 19:56:26 elapsed_secs :14: -- L: 1189 M: Imp::_die HUNT_500 this->lg :en: this->id_ex :2181:
which tells me that this message was triggered in the module Imp
at method _die
which is the last thing the program does before exiting:
function _die() {
$this->dba->close();
$this->output->_display();
wp_str_to_file('L: '.__LINE__.' M: '.__METHOD__." this->lg :$this->lg: this->id_ex :$this->id_ex:",
'/tmp/tmp_'.$this->id_ex.'__hunt_500.log');
exit;
} # _die
Hence the program did what it should which, of course, the updated data proved already. Why then has Yaws a problem with the output nevertheless?
I use data caching by mnesia
with Yaws heavily. Of course, some Urls must not be cached. The parameter rrr
disables caching, so the output of the offending URL is not touched by Yaws but passed through instead.
The relevant appmod functions:
out(A) ->
S = check_intrusion(A#arg.querydata), % out(A) ->
if
S ->
out_hello(); % avoid bad Urls
true ->
out_check_log(A) % out, real content, filter for cache
end.
check_intrusion(U) -> % avoid bad Urls
case U of
undefined ->
0;
_ ->
L = ["/solr", "HelloThink", "XDEBUG_SESSION", "phpstorm", "invokefunction", "/TP/public"],
[string:find(U, N) || N <- L, string:find(U, N) /= nomatch] /= []
end.
out_check_log(A) ->
R = check_log(A),
if
R -> % logging turned on, pass through
Url = get_uri(A) ++ get_cookies_string(A),
pass_through(Url);
true ->
out_myurl(A) % real work, eventually saved by mnesia
end.
check_log(A) ->
Url = get_uri_string(A),
R = string:str(Url, "log=1") > 0,
R.
out_myurl(A) ->
Del = yaws_api:queryvar(A, "del"),
Lg = get_lang(A), % Lg is important
Request_path = get_request_path(A),
Lg_url = prepend_lg(Lg, Request_path),
if
Del /= undefined ->
url:delete_record(myurl, Lg),
% we have at least lg=... will delete if we have no GET param
url:delete_record(myurl, Lg_url);
% we have at least lg=... delete the record
true -> % out_myurl Del
ok
end,
case skip_this_url(Request_path) of
false -> % we want to save or retrieve this
Start = deb:get_timestamp(), % timestamp -- a bit late, we miss some cycles
url:init(), % mnesia
Html = select_or_record(A, Request_path, Start, Lg),
% should replace elapsed time with real data
Html; % we get cached data with fresh timestamp
true ->
pass_through(Request_path) % out_myurl skip_this_url
end.
skip_this_url(Url) -> % determine URLs not to be cached
R = string:str(Url, ".ico") > 0
orelse
string:str(Url, "search=") > 0 % search is so fast and should be fresh
% [...] % lots of conditions
orelse
string:str(Url, "rrr=1") > 0 % param to not cache
orelse
string:str(Url, ".well-known") > 0 % letsencrypt
orelse
string:str(Url, "uploads/") > 0
orelse
string:str(Url, "xframe/") > 0,
R.
pass_through(Request_path) ->
Url = check_scheme(Request_path),
inets:start(),
ssl:start(),
Url_2 = string:replace(Url, " ", "%2B"),
Http_result = httpc:request(Url_2),
{ok, {{V, S, R}, _, _}} = Http_result, % we get the results
case Http_result of
{ok, {{_Version, 200, _ReasonPhrase}, _Headers, Html}} ->
{html, Html}; % we get the results, display them
{ok, {{_Version, 401, _ReasonPhrase}, _Headers, Html}} ->
{html, Html}; % auth mechanism
_ ->
{ok, {{_, Status, ReasonPhrase}, _, _}} = Http_result,
not_200 % some kind of error
end.
As far as I can see, rrr=1
invokes pass_through
which just does a httpc:request
and passes the result on when we get a 200
.
So far, the error affects only the output of the updating mechanism without touching it. This output is rather irrelevant, so I could leave it at that. The data is updated fine, which is the main reason for the job. However, it bugs me, nevertheless.
How can I debug this? I don't use an interception module
in yaws.conf
which is noted for 500 errors. Are there any methods to investigate the 500 error?
wsm
works fine. No idea how this can lead to the error. Why does yaws
do that?
I stopped yaws in the yaws container with yaws --stop
and restarted with
rm -f /tmp/myurl*.log
yaws -i
to turn on debugging. I could not see any difference.
Inspecting myurl.log
, I find lots of entries of type
------2023-07-26 12:57:57---
139
url:delete_record RECORD NOT FOUND, mnesia:abort(not_exist)
Key "de"
which is just a notification from function delete_record(Table, Key)
where Key
cycles through all the 5 languages involved here.
Looking at the trace in yaws I see no difference: both versions end with
Mod:myurl line:575 'select_or_record(A)=SELECT_OR_RECORD==SELECT_NOT_FOUND=> INSERT_WEB_PAGE Cookie' ["<!-- : -->"]
Any idea where to look for?
Upvotes: 0
Views: 85