ThrowsError

Reputation: 1689

Rust how to interact with a window

Introduction

Currently, I'm working for a customer who wants to automate some actions inside their accounting application.

Problem

I searched for a crate that can read the screen of another window and post messages to it, such as key presses or mouse clicks, but I didn't find anything.

Question

Does anyone know a crate for interacting with another window in Rust? I need three kinds of interaction: reading the window's screen contents, posting key messages, and posting click messages to that window.

Upvotes: 6

Views: 2583

Answers (2)

frankenapps

Reputation: 8241

There are various crates that let you simulate user input (e.g. mouse and keyboard input) even in a cross-platform fashion:

and crates for taking screenshots like

Apart from that, there is also autopilot, which lets you do both.

Here is an example of capturing the main screen using autopilot, with image used to encode and store the result:

use image::{GenericImageView, png::PNGEncoder};

fn main() {
    let bitmap = autopilot::bitmap::capture_screen().expect("Failed to capture main screen.");

    let mut buf = Vec::new();
    let encoder = PNGEncoder::new(
        &mut buf
    );
    encoder
        .encode(
            &bitmap.image.as_rgb8().unwrap(),
            bitmap.image.width(),
            bitmap.image.height(),
            image::ColorType::RGB(8),
        )
        .expect("Failed to encode png.");

    std::fs::write("test.png", buf).expect("Failed to write screenshot to disk.");
}

And here is an example of mouse input (moving the cursor in a circle):

const MARGIN: f64 = 10.0;
const MILLIS: u64 = 10;

struct Center {
    x: f64,
    y: f64,
}

fn main() {
    circle_mouse().expect("Unable to move mouse");
}

fn circle_mouse() -> Result<(), autopilot::mouse::MouseError> {
    let screen_size = autopilot::screen::size();
    let scoped_height = screen_size.height / 2.0 - MARGIN;
    let scoped_width = screen_size.width / 2.0 - MARGIN;
    // Use the smaller half-extent so the circle stays inside the screen.
    let scoped_radius = scoped_height.min(scoped_width);

    let center = Center { x: scoped_width, y: scoped_height };

    for i in 0..360 {
        let x = (i as f64 / 180.0 * std::f64::consts::PI).cos() * scoped_radius;
        let y = (i as f64 / 180.0 * std::f64::consts::PI).sin() * scoped_radius;
        autopilot::mouse::move_to(autopilot::geometry::Point::new(
            center.x + x,
            center.y + y,
        ))?;
        std::thread::sleep(std::time::Duration::from_millis(MILLIS));
    }

    Ok(())
}

If you only want to capture the area of the window, you can take a screenshot of the full desktop and then crop it to the window. On Windows, you can get the rectangle of a given window using GetWindowRect. Here is a snippet for getting the window rect using its ID or name.
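
The cropping step itself can be sketched without any platform API. This is a minimal sketch, assuming the window rectangle is given as left/top/right/bottom edges in screen pixels (the layout GetWindowRect reports) and that the screenshot starts at (0, 0). The `PixelRect` type and the `crop_region` function are hypothetical names for illustration, not part of autopilot or windows-sys:

```rust
/// Hypothetical rectangle in screen pixels, described by its edges
/// (same layout as the Win32 RECT).
#[derive(Debug, Clone, Copy, PartialEq)]
struct PixelRect {
    left: i32,
    top: i32,
    right: i32,
    bottom: i32,
}

/// Returns the part of `window` that lies inside a screenshot of
/// `screen_width` x `screen_height` pixels as (x, y, width, height),
/// or None if the window is completely outside the screenshot.
fn crop_region(
    window: PixelRect,
    screen_width: i32,
    screen_height: i32,
) -> Option<(u32, u32, u32, u32)> {
    // Clamp every edge to the screenshot bounds.
    let left = window.left.max(0);
    let top = window.top.max(0);
    let right = window.right.min(screen_width);
    let bottom = window.bottom.min(screen_height);

    // An empty or inverted rectangle means nothing is visible.
    if right <= left || bottom <= top {
        return None;
    }
    Some((
        left as u32,
        top as u32,
        (right - left) as u32,
        (bottom - top) as u32,
    ))
}
```

The returned tuple has the (x, y, width, height) shape that image-cropping functions commonly expect, and clamping means a partially off-screen window yields its visible part instead of an error.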

Update due to request in comment

Here is a sample of how to capture only the specific portion of the screen that contains a given window (this works only on Windows, and the window must be fully visible on screen):

use autopilot::geometry::{Point, Rect, Size};
use std::{ffi::OsString, iter::once, os::windows::prelude::OsStrExt, ptr::null};
use windows_sys::Win32::{
    Foundation::{HWND, RECT},
    UI::WindowsAndMessaging::{FindWindowW, GetWindowRect},
};

fn main() {
    // The title of the window (is shown when hovering over the window in the taskbar):
    let window_name = OsString::from("autopilot");
    let window_name: Vec<u16> = window_name
        .as_os_str()
        .encode_wide()
        .chain(once(0))
        .collect();
    let id: HWND = unsafe { FindWindowW(null(), window_name.as_ptr()) };
    let mut rect = RECT {
        left: 0,
        top: 0,
        right: 0,
        bottom: 0,
    };
    if id != 0 && unsafe { GetWindowRect(id, &mut rect) } != 0 {
        /* println!(
            "HWND: {}\nLocation: {} {}\nSize: {} {}",
            id,
            rect.left,
            rect.top,
            rect.right - rect.left,
            rect.bottom - rect.top
        ); */

        let bitmap = autopilot::bitmap::capture_screen_portion(Rect::new(
            Point::new(rect.left as f64, rect.top as f64),
            Size::new(
                (rect.right - rect.left) as f64,
                (rect.bottom - rect.top) as f64,
            ),
        ))
        .expect("Failed to capture screen portion.");
        bitmap
            .image
            .save("screen_portion.png")
            .expect("Failed to write image to disk.");
    }
}

This sample has the following dependencies:

autopilot = "0.4.0"
windows-sys = { version = "0.36.1", features = ["Win32_Foundation", "Win32_UI_WindowsAndMessaging"] }

Note that this includes the invisible window borders, which you probably do not want. You can use DwmGetWindowAttribute to correct for the visual offset like this:

use autopilot::geometry::{Point, Rect, Size};
use std::{
    ffi::{c_void, OsString},
    iter::once,
    mem::size_of,
    os::windows::prelude::OsStrExt,
    ptr::null,
};
use windows_sys::Win32::{
    Foundation::{HWND, RECT},
    Graphics::Dwm::{DwmGetWindowAttribute, DWMWA_EXTENDED_FRAME_BOUNDS},
    UI::WindowsAndMessaging::{FindWindowW, GetWindowRect},
};

fn main() {
    // The title of the window (is shown when hovering over the window in the taskbar):
    let window_name = OsString::from("autopilot");
    let window_name: Vec<u16> = window_name
        .as_os_str()
        .encode_wide()
        .chain(once(0))
        .collect();
    let id: HWND = unsafe { FindWindowW(null(), window_name.as_ptr()) };
    let mut rect = RECT {
        left: 0,
        top: 0,
        right: 0,
        bottom: 0,
    };
    if id != 0 && unsafe { GetWindowRect(id, &mut rect) } != 0 {
        /* println!(
            "Window:\nHWND: {}\nLocation: {} {}\nSize: {} {}",
            id,
            rect.left,
            rect.top,
            rect.right - rect.left,
            rect.bottom - rect.top
        ); */

        let mut frame = RECT {
            left: 0,
            top: 0,
            right: 0,
            bottom: 0,
        };
        // DwmGetWindowAttribute returns an HRESULT; 0 (S_OK) means success.
        let res = unsafe {
            DwmGetWindowAttribute(
                id,
                DWMWA_EXTENDED_FRAME_BOUNDS,
                &mut frame as *mut RECT as *mut c_void,
                size_of::<RECT>() as u32,
            )
        };
        if res != 0 {
            panic!("Failed to query the extended frame bounds.");
        }

        let border = RECT {
            left: frame.left - rect.left,
            top: frame.top - rect.top,
            right: rect.right - frame.right,
            bottom: rect.bottom - frame.bottom,
        };

        let adjusted_rect = RECT {
            left: rect.left + border.left,
            top: rect.top + border.top,
            right: rect.right - border.right,
            bottom: rect.bottom - border.bottom,
        };

        // Window must be fully on screen for capture.
        if rect.left >= 0 && rect.top >= 0 {
            let bitmap = autopilot::bitmap::capture_screen_portion(Rect::new(
                Point::new(adjusted_rect.left as f64, adjusted_rect.top as f64),
                Size::new(
                    (adjusted_rect.right - adjusted_rect.left) as f64,
                    (adjusted_rect.bottom - adjusted_rect.top) as f64,
                ),
            ))
            .expect("Failed to capture screen portion.");
            bitmap
                .image
                .save("screen_portion.png")
                .expect("Failed to write image to disk.");
        }
    }
}

using these dependencies:

autopilot = "0.4.0"
windows-sys = { version = "0.36.1", features = ["Win32_Foundation", "Win32_Graphics_Dwm", "Win32_UI_WindowsAndMessaging"] }
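
The samples here repeat the conversion of the window title into a null-terminated UTF-16 buffer. A minimal sketch of a reusable helper follows, assuming the title is an ordinary Rust string. The name `to_wide` is made up for illustration. It uses `str::encode_utf16` from the standard library instead of `encode_wide`, which produces the same code units for valid UTF-8 input and also compiles on other platforms:

```rust
use std::iter::once;

/// Hypothetical helper: encodes `s` as UTF-16 and appends the terminating
/// null that Win32 functions such as FindWindowW expect.
fn to_wide(s: &str) -> Vec<u16> {
    s.encode_utf16().chain(once(0)).collect()
}
```

Bind the result to a variable before passing `.as_ptr()` to the Win32 call, so the buffer stays alive for the duration of the call.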

Update due to another comment

Yes, you can use the PostMessageW function from the WinAPI in Rust, too. Here is a simple sample that contains the basic idea of the linked sample:

use std::{ffi::OsString, iter::once, os::windows::prelude::OsStrExt, ptr::null};
use windows_sys::Win32::{
    Foundation::HWND,
    UI::{
        Input::KeyboardAndMouse::VK_LEFT,
        WindowsAndMessaging::{FindWindowW, PostMessageW},
    },
};

// Win32 message identifiers WM_KEYDOWN (0x0100) and WM_KEYUP (0x0101).
const KEY_DOWN: u32 = 0x0100;
const KEY_UP: u32 = 0x0101;

fn main() {
    // The title of the window (is shown when hovering over the window in the taskbar):
    let window_name = OsString::from("autopilot");
    let window_name: Vec<u16> = window_name
        .as_os_str()
        .encode_wide()
        .chain(once(0))
        .collect();
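    let id: HWND = unsafe { FindWindowW(null(), window_name.as_ptr()) };
    // FindWindowW returns 0 when no window with that title exists.
    if id == 0 {
        panic!("Failed to find the window.");
    }

    let res = unsafe { PostMessageW(id, KEY_DOWN, VK_LEFT.into(), 0) };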
    if res == 0 {
        panic!("Failed to post message to window.");
    }

    std::thread::sleep(std::time::Duration::from_millis(500));
    let res = unsafe { PostMessageW(id, KEY_UP, VK_LEFT.into(), 0) };
    if res == 0 {
        panic!("Failed to post message to window.");
    }
}

It depends on:

windows-sys = { version = "0.36.1", features = ["Win32_Foundation", "Win32_UI_Input_KeyboardAndMouse", "Win32_UI_WindowsAndMessaging"] }
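
Click messages can be posted the same way as key messages; the main difference is that the cursor position travels in lParam. The following is a minimal sketch, assuming the target window reacts to posted mouse messages (some applications ignore them) and that the coordinates are relative to the window's client area. `make_lparam` and `click_messages` are hypothetical helpers; the numeric values are the documented Win32 constants WM_LBUTTONDOWN, WM_LBUTTONUP and MK_LBUTTON:

```rust
/// Win32 message identifiers and the "left button is down" flag.
const WM_LBUTTONDOWN: u32 = 0x0201;
const WM_LBUTTONUP: u32 = 0x0202;
const MK_LBUTTON: usize = 0x0001;

/// Packs client-area coordinates into an lParam: x in the low 16 bits,
/// y in the high 16 bits (the job of the MAKELPARAM macro in C).
fn make_lparam(x: i16, y: i16) -> isize {
    let packed = ((y as u16 as u32) << 16) | (x as u16 as u32);
    packed as isize
}

/// Returns the (message, wParam, lParam) triples for one left click
/// at the client-area position (x, y).
fn click_messages(x: i16, y: i16) -> [(u32, usize, isize); 2] {
    let lparam = make_lparam(x, y);
    [
        (WM_LBUTTONDOWN, MK_LBUTTON, lparam),
        (WM_LBUTTONUP, 0, lparam),
    ]
}
```

Each triple can then be passed to PostMessageW together with the window handle, just like the key messages in the sample above, ideally with a short pause between the down and up message.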

If you want to detect certain UI elements on the screen and get their position, you would probably need to implement this yourself with pattern matching or computer vision, using something like opencv and taking the screenshot captured beforehand as input.
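
As a rough illustration of the idea, here is a naive template search written in plain Rust rather than with opencv. It is a sketch under assumptions: both images are 8-bit grayscale buffers in row-major order, the `Gray` type and the `find_template` function are invented for this example, and an exhaustive sum-of-absolute-differences scan is much slower than what a computer-vision library would do.

```rust
/// Hypothetical 8-bit grayscale image stored row by row.
struct Gray {
    width: usize,
    height: usize,
    pixels: Vec<u8>,
}

impl Gray {
    fn at(&self, x: usize, y: usize) -> u8 {
        self.pixels[y * self.width + x]
    }
}

/// Slides `needle` over `haystack` and returns the top-left position with
/// the smallest sum of absolute differences, or None if the needle is
/// empty or does not fit into the haystack.
fn find_template(haystack: &Gray, needle: &Gray) -> Option<(usize, usize)> {
    if needle.width == 0
        || needle.height == 0
        || needle.width > haystack.width
        || needle.height > haystack.height
    {
        return None;
    }

    let mut best: Option<(u64, (usize, usize))> = None;
    for oy in 0..=haystack.height - needle.height {
        for ox in 0..=haystack.width - needle.width {
            // Score this offset: 0 means a pixel-perfect match.
            let mut sad = 0u64;
            for ny in 0..needle.height {
                for nx in 0..needle.width {
                    let a = haystack.at(ox + nx, oy + ny) as i32;
                    let b = needle.at(nx, ny) as i32;
                    sad += (a - b).abs() as u64;
                }
            }
            if best.map_or(true, |(score, _)| sad < score) {
                best = Some((sad, (ox, oy)));
            }
        }
    }
    best.map(|(_, pos)| pos)
}
```

The function always returns the best offset, even when nothing on screen resembles the template, so a real tool would also compare the score against a threshold before clicking anything.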

Upvotes: 2

Ricardo

Reputation: 1356

This is not an easy thing to do, and it might take a lot of time.

If you go for @frankenapps' answer, I would recommend that you use something like opencv or some kind of AI to recognize the UI and then click or perform another action depending on it.

There is also another way to do this kind of task. You can use Frida, which lets you attach to the program's functions and change values. You would have to do some analysis of the binary to understand it, though.

Here is a nice example that shows how it works, and this is the Rust binding.

Both ways are going to take a while, but I just wanted to add another solution. Have fun!

Upvotes: 1
