Reputation: 256761
Is there an API in Windows that can crack a url into parts?
The format of a URL is:
stackoverflow://iboyd:password01@mail.stackoverflow.com:12386/questions/SubmitQuestion.aspx?useLiveData=1&internal=0#nose
\___________/   \___/ \________/ \____________________/ \___/ \___________________________/\_______________________/ \__/
      |           |        |                |             |                 |                          |              |
   scheme     username password         hostname        port              path                       query        fragment
Is there a function in (native) Win32 api that can crack a URL into parts:
stackoverflow
iboyd
password01
mail.stackoverflow.com
12386
questions/SubmitQuestion.aspx
?useLiveData=1&internal=0
nose
There are some functions in WinApi, but they fail to do the job because they don't understand schemes except the ones that WinHttp
can use:
both fail to understand urls such as:
ws://stackoverflow.com (web-socket)
wss://stackoverflow.com (web-socket secure)
sftp://fincen.gov/submit (SSL file transfer)
magnet:?xt=urn:btih:c4244b6d0901f71add9a1f9e88013a2fa51a9900
stratum+udp://blockchain.info
WinHttpCrackUrl actively prevents being used to crack URLs:
If the Internet protocol of the URL passed in for pwszUrl is not HTTP or HTTPS, then WinHttpCrackUrl returns FALSE and GetLastError indicates ERROR_WINHTTP_UNRECOGNIZED_SCHEME.
Is there another native API in Windows that can get parts of a url?
Here's how you do it in the CLR (e.g. C#):
using System;

public class Program
{
    public static void Main()
    {
        var uri = new Uri("stackoverflow://iboyd:password01@mail.stackoverflow.com:12386/questions/SubmitQuestion.aspx?useLiveData=1&internal=0#nose");
        Console.WriteLine("Uri.Scheme: "+uri.Scheme);
        Console.WriteLine("Uri.UserInfo: "+uri.UserInfo);
        Console.WriteLine("Uri.Host: "+uri.Host);
        Console.WriteLine("Uri.Port: "+uri.Port);
        Console.WriteLine("Uri.AbsolutePath: "+uri.AbsolutePath);
        Console.WriteLine("Uri.Query: "+uri.Query);
        Console.WriteLine("Uri.Fragment: "+uri.Fragment);
    }
}
Outputs
Uri.Scheme: stackoverflow
Uri.UserInfo: iboyd:password01
Uri.Host: mail.stackoverflow.com
Uri.Port: 12386
Uri.AbsolutePath: /questions/SubmitQuestion.aspx
Uri.Query: ?useLiveData=1&internal=0
Uri.Fragment: #nose
Upvotes: 1
Views: 2819
Reputation: 51413
Is there an API in Windows that can crack a url into parts?
There is in Windows 10. The Uri class in the Windows Runtime is capable of decomposing a URI into its individual parts. This is not strictly part of the Windows API, but consumable by any Windows API application.
The following code illustrates its usage. It is written using the C++/WinRT language projection, requiring a C++17 compiler. If you cannot switch to a C++17 compiler, you can use the Windows Runtime C++ Template Library (WRL) instead to consume the Windows Runtime APIs.
#include <iostream>
#include <string>
#include <winrt/Windows.Foundation.h>

#pragma comment(lib, "WindowsApp.lib")

using namespace winrt;
using namespace Windows::Foundation;

int wmain(int argc, wchar_t* wargv[])
{
    if (argc != 2)
    {
        std::wcout << L"Usage:\n  UrlCracker <url>" << std::endl;
        return 1;
    }

    init_apartment();

    Uri const uri{ wargv[1] };
    std::wcout << L"Scheme: " << uri.SchemeName().c_str() << std::endl;
    std::wcout << L"Username: " << uri.UserName().c_str() << std::endl;
    std::wcout << L"Password: " << uri.Password().c_str() << std::endl;
    std::wcout << L"Host: " << uri.Host().c_str() << std::endl;
    std::wcout << L"Port: " << std::to_wstring(uri.Port()) << std::endl;
    std::wcout << L"Path: " << uri.Path().c_str() << std::endl;
    std::wcout << L"Query: " << uri.Query().c_str() << std::endl;
    std::wcout << L"Fragment: " << uri.Fragment().c_str() << std::endl;
}
This program digests any URI spelled out in the question. Using the input
stackoverflow://iboyd:password01@mail.stackoverflow.com:12386/questions/SubmitQuestion.aspx?useLiveData=1&internal=0#nose
produces the following output:
Scheme: stackoverflow
Username: iboyd
Password: password01
Host: mail.stackoverflow.com
Port: 12386
Path: /questions/SubmitQuestion.aspx
Query: ?useLiveData=1&internal=0
Fragment: #nose
Error handling has been omitted. If the Uri constructor is passed an invalid string, it throws an exception of type winrt::hresult_error. If you cannot use exceptions in your code, you can activate the type manually (e.g. using the WRL) and inspect the HRESULT return values instead.
Upvotes: 1
Reputation: 256761
There are a number of functions available to native Windows developers:
Of these, InternetCrackUrl works.
//Zero the structure first so that every lpsz* pointer is NULL. With a NULL pointer and a
//non-zero length, InternetCrackUrl returns pointers into the original url string.
URL_COMPONENTS components = {};
components.dwStructSize = sizeof(URL_COMPONENTS);
components.dwSchemeLength = DWORD(-1);
components.dwHostNameLength = DWORD(-1);
components.dwUserNameLength = DWORD(-1);
components.dwPasswordLength = DWORD(-1);
components.dwUrlPathLength = DWORD(-1);
components.dwExtraInfoLength = DWORD(-1);

if (!InternetCrackUrl(url, url.Length, 0, ref components))
   RaiseLastOSError();

//Each component is a pointer into url plus a length, not a null-terminated string
String scheme = StrLCopy(components.lpszScheme, components.dwSchemeLength);
String username = StrLCopy(components.lpszUserName, components.dwUserNameLength);
String password = StrLCopy(components.lpszPassword, components.dwPasswordLength);
String host = StrLCopy(components.lpszHostName, components.dwHostNameLength);
Int32 port = components.nPort;
String path = StrLCopy(components.lpszUrlPath, components.dwUrlPathLength);
String extraInfo = StrLCopy(components.lpszExtraInfo, components.dwExtraInfoLength);
This means that
stackoverflow://iboyd:password01@mail.stackoverflow.com:12386/questions/SubmitQuestion.aspx?useLiveData=1&internal=0#nose
is parsed into:
stackoverflow
iboyd
password01
mail.stackoverflow.com
12386
/questions/SubmitQuestion.aspx
?useLiveData=1&internal=0#nose
It sucks that InternetCrackUrl doesn't make a distinction between:
?query
#fragment
and just mashes them together as ExtraInfo:
ExtraInfo: ?useLiveData=1&internal=0#nose
Query: ?useLiveData=1&internal=0
Fragment: #nose
So we have to do some splitting if we want the ?query or the #fragment:
/*
   InternetCrackUrl returns ?query#fragment in a single combined extraInfo field.
   Split that into separate
      ?query
      #fragment
*/
String query = extraInfo;
String fragment = "";

Int32 n = StrPos("#", extraInfo);
if (n >= 1) //one-based string indexes
{
   query = extraInfo.SubString(1, n-1);
   fragment = extraInfo.SubString(n, MaxInt);
}
Giving us the final desired:
stackoverflow
iboyd
password01
mail.stackoverflow.com
12386
/questions/SubmitQuestion.aspx
?useLiveData=1&internal=0#nose
?useLiveData=1&internal=0
#nose
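For anyone working in C++ rather than the pseudocode above, the query/fragment split might look roughly like the following. This is only a sketch: SplitExtraInfo is a hypothetical helper name (it is not part of WinINet), and it assumes the extra info has already been copied out of URL_COMPONENTS into a std::wstring.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper (not a WinINet function): splits the combined
// "?query#fragment" text that InternetCrackUrl reports as extra info.
// Returns {query, fragment}; each part keeps its leading '?' or '#'.
std::pair<std::wstring, std::wstring> SplitExtraInfo(const std::wstring& extraInfo)
{
    std::wstring query = extraInfo;   // default: everything is the query
    std::wstring fragment;            // default: no fragment

    // The first '#' starts the fragment; everything before it is the query.
    const std::wstring::size_type hash = extraInfo.find(L'#');
    if (hash != std::wstring::npos)
    {
        query = extraInfo.substr(0, hash);
        fragment = extraInfo.substr(hash);
    }

    // Sanity check: the two parts together cover the whole input.
    assert(query.size() + fragment.size() == extraInfo.size());
    return { query, fragment };
}
```

Calling SplitExtraInfo(L"?useLiveData=1&internal=0#nose") gives ?useLiveData=1&internal=0 as the query and #nose as the fragment. Note that std::wstring::find is zero-based, unlike the one-based StrPos in the pseudocode.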
Upvotes: 2