This article is a continuation of Making software for mirroring Android screen to PC 1, which explains how to build the screen-mirroring function itself. This time, we add the ability to operate the Android device with the mouse.

Ideally there would be a single connection between the Android device and the PC, but I compromised because that looked like it would complicate the program.

The PC side displays the frames flowing out of FFmpeg, and converts mouse operations into touch operations that are sent to the device.

The Android side, as before, encodes the screen contents and sends them to the PC. It now also receives touch events from the PC and injects them into the system.
This section describes how Android devices handle touches and keystrokes. You can implement real-time touch without knowing these details, so if you're not interested, skip ahead to "Implementation".
How can I operate Android from a PC? Let's start with a simple example.
Android is a Linux-based operating system, so even though it is a restricted environment, you can use a Linux shell.
```
adb shell
```
Running this drops you into an interactive shell. And since there is a command called **input** that sends touch and key operations, let's use it. Some examples:
```
input touchscreen tap x y
input touchscreen swipe x1 y1 x2 y2
input keyevent Key
input text Text
```
By executing these, you can easily send an event to the device. But when you run them, **it takes noticeable time before the event fires**, which makes them unsuitable for real-time operation. And what if you want to perform operations more complicated than taps and swipes?
The **getevent** command outputs the data coming from the device's touch panel and physical keys. Try operating the touch panel after running the following command.
```
getevent
```
Numbers will then stream past as in the image below. This is the data sent from the touch panel; the OS interprets this data and acts on it. How it is actually interpreted is explained in [Android] Touching the terminal from the program [ADB], so please have a look.
**sendevent** is a command that can send arbitrary data as if it had come from the touch panel. In other words, a touch operation can be reproduced by resending data captured with **getevent** through **sendevent**. However, this command is also slow to execute, so the reproduction is not perfect.
Since Android is Linux-based, it has device files. If you use a non-Unix OS such as Windows they may be unfamiliar: a device file is a special file used to interact with a connected device. For example, if you open the touch panel's device file, you can read the touch data just as getevent showed it. Conversely, if you write to a device file, the written data is treated as if it had come from that device, just like sendevent. [Device Special File](https://uc2.h2np.net/index.php/%E3%83%87%E3%83%90%E3%82%A4%E3%82%B9%E3%82%B9%E3%83%9A%E3%82%B7%E3%83%A3%E3%83%AB%E3%83%95%E3%82%A1%E3%82%A4%E3%83%AB) should be a helpful reference.
Let's actually open a device file. In the image above you can see the path /dev/input/event4; that is where the device file lives. The /dev/input directory holds several device files (event0, event1, ...) corresponding to the touch panel, physical keys, sensors, and so on. **Note that which number corresponds to what differs from device to device.**
```
cat /dev/input/event●
```
When you run this command, the data from that device flows out. However, it looks like this:
The raw data obtained from a device file is binary, so it is not displayed as readable characters. getevent, sendevent, and the input command introduced at the beginning all convert between this binary and characters (numbers).
By the way,
```
cat /dev/input/event● > /sdcard/event.bin
```
By doing so, you can save the raw data to a file.
```
cat /sdcard/event.bin > /dev/input/event●
```
then replays the touch data. But there is a trap here as well: it plays back far too fast. Event data recorded over several seconds streams out in an instant, much too quickly for the device to be operated properly. To reproduce the operation faithfully, you need to insert sleeps so that each event is sent at the recorded timing, as in the sketch below.
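A minimal sketch of such a replay might look like this. It assumes a 64-bit device, where one struct input_event record is 24 bytes (16 bytes of timestamp, 2 of type, 2 of code, 4 of value; on 32-bit devices it is 16 bytes total); the event number 4 is just an example, and shell privileges are needed to read and write the device file:

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;

public class EventReplay {
    static final int EVENT_SIZE = 24; // sizeof(struct input_event) on 64-bit Android

    public static void main(String[] args) throws Exception {
        DataInputStream in = new DataInputStream(new FileInputStream("/sdcard/event.bin"));
        FileOutputStream out = new FileOutputStream("/dev/input/event4"); // number varies by device
        byte[] buf = new byte[EVENT_SIZE];
        long prevMicros = -1;
        while (in.read(buf) == EVENT_SIZE) {
            // The first 16 bytes are the recorded timestamp (little-endian seconds + microseconds)
            long micros = readLongLE(buf, 0) * 1_000_000L + readLongLE(buf, 8);
            if (prevMicros >= 0 && micros > prevMicros)
                Thread.sleep((micros - prevMicros) / 1000); // wait out the recorded interval
            prevMicros = micros;
            out.write(buf); // hand the raw event back to the driver
        }
        in.close();
        out.close();
    }

    static long readLongLE(byte[] b, int off) {
        long v = 0;
        for (int i = 7; i >= 0; i--) v = (v << 8) | (b[off + i] & 0xFF);
        return v;
    }
}
```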
The point is that touch events sent from the PC can be written to this device file. You could probably do it with a shell script, but since we are developing with Android Studio, it is easier to write in Java/Kotlin. There is a catch, though: so far we have been accessing device files freely, and that is only possible with shell privileges. An ordinary app has no permission to touch system files (unless the device is rooted). I was about to give up when I found the question How does vysor create touch events on a non rooted device?, which asks exactly how Vysor achieves real-time touch on a non-rooted device. The answer:
> What he does is, he then starts his Main class as a separate process using this shell user. Now, the Java code inside that Main class has the same privileges as the shell user (because duh, it's linux).
In other words, if a class contained in the apk package is launched from the shell, the launched program also gets shell privileges. Obvious in hindsight, but I hadn't thought of it. This is the method we'll use to implement real-time touch.
As mentioned above, shell privileges are required to inject events into the system from an app. First, to start a class contained in the apk package with shell privileges, run the following:
```
sh -c "CLASSPATH=[path to apk file] /system/bin/app_process /system/bin [package name].[class containing the main method]"
```
The path to the apk file can be obtained with
```
pm path [package name]
```
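For example, on one device the output might look like the following (this path is hypothetical; it differs per device and install):

```
package:/data/app/space.siy.screencastsample-1/base.apk
```

With the package and class used later in this article, the launch command then becomes:

```
sh -c "CLASSPATH=/data/app/space.siy.screencastsample-1/base.apk /system/bin/app_process /system/bin space.siy.screencastsample.InputHost"
```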
However, if you debug normally from Android Studio, multiple paths will be displayed. This happens when Instant Run is enabled, a feature that shortens build times by reflecting program changes in real time: it splits the apk into several parts so that only the changed part is built and installed. A split apk breaks the command above, though, so you need to disable Instant Run. The setting is in the following location.
One more caveat: the app launched by the user and the program started with shell privileges belong to the same package, but they run as separate processes, so they cannot share anything at all through statics. If they need to communicate, it has to happen over a socket or the like.
First look at the code below.
```java:InputService.java
import android.hardware.input.InputManager;
import android.os.SystemClock;
import android.view.InputEvent;
import android.view.KeyEvent;
import android.view.MotionEvent;

import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class InputService {
    InputManager im;
    Method injectInputEventMethod;

    public InputService() throws Exception {
        // Get an instance of InputManager via the hidden getInstance()
        im = (InputManager) InputManager.class.getDeclaredMethod("getInstance").invoke(null, new Object[0]);
        // Make the static factory method that generates MotionEvent accessible
        MotionEvent.class.getDeclaredMethod(
                "obtain",
                long.class, long.class, int.class, int.class,
                MotionEvent.PointerProperties[].class, MotionEvent.PointerCoords[].class,
                int.class, int.class, float.class, float.class, int.class, int.class, int.class, int.class
        ).setAccessible(true);
        // Get the hidden method that injects an event into the system
        injectInputEventMethod = InputManager.class.getDeclaredMethod("injectInputEvent", new Class[]{InputEvent.class, int.class});
    }

    // Generate and inject a touch event
    public void injectMotionEvent(int inputSource, int action, float x, float y) throws InvocationTargetException, IllegalAccessException {
        MotionEvent.PointerProperties[] pointerProperties = new MotionEvent.PointerProperties[1];
        pointerProperties[0] = new MotionEvent.PointerProperties();
        pointerProperties[0].id = 0;
        MotionEvent.PointerCoords[] pointerCoords = new MotionEvent.PointerCoords[1];
        pointerCoords[0] = new MotionEvent.PointerCoords();
        pointerCoords[0].pressure = 1;
        pointerCoords[0].size = 1;
        pointerCoords[0].touchMajor = 1;
        pointerCoords[0].touchMinor = 1;
        pointerCoords[0].x = x;
        pointerCoords[0].y = y;
        MotionEvent event = MotionEvent.obtain(SystemClock.uptimeMillis(), SystemClock.uptimeMillis(), action, 1, pointerProperties, pointerCoords, 0, 0, 1, 1, 0, 0, inputSource, 0);
        injectInputEventMethod.invoke(im, new Object[]{event, 0});
    }

    // Inject a key event
    public void injectKeyEvent(KeyEvent event) throws InvocationTargetException, IllegalAccessException {
        injectInputEventMethod.invoke(im, new Object[]{event, 0});
    }
}
```
This code is based on here and uses the newer API; I made a few changes to fit this use case.
Reflection is used to grab methods and instances that are normally inaccessible, and shell privileges let us actually use them on the system. The key method is injectInputEvent: hand it event data and the event is executed. I checked the AOSP source to see where this method really lives, and found it on line 914 of /hardware/input/InputManager.java (here). It carries the @hide annotation, which is why it is normally inaccessible. If reflection is new to you, this article may be helpful.
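As a quick illustration, a hypothetical tap at (540, 960) driven through this class would boil down to something like this (the coordinates are made up; the constants are the standard MotionEvent / InputDeviceCompat values):

```java
InputService input = new InputService();
// MotionEvent.ACTION_DOWN == 0, MotionEvent.ACTION_UP == 1
input.injectMotionEvent(InputDeviceCompat.SOURCE_TOUCHSCREEN, MotionEvent.ACTION_DOWN, 540f, 960f);
input.injectMotionEvent(InputDeviceCompat.SOURCE_TOUCHSCREEN, MotionEvent.ACTION_UP, 540f, 960f);
```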
Next, the events sent from the PC are executed using the InputService class above.
```java:InputHost.java
import android.support.v4.view.InputDeviceCompat;
import android.view.KeyEvent;

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class InputHost {
    static InputService inputService;
    static ServerSocket listener; // Server socket
    static Socket clientSocket; // Socket to the client
    static InputStream inputStream; // For receiving messages from the client
    static OutputStream outputStream; // For sending data to the client
    static boolean running = false;

    public static void main(String[] args) {
        try {
            inputService = new InputService();
        } catch (Exception ex) {
            ex.printStackTrace();
        }
        try {
            listener = new ServerSocket();
            listener.setReuseAddress(true);
            listener.bind(new InetSocketAddress(8081));
            System.out.println("Server listening on port 8081...");
            clientSocket = listener.accept(); // Block until a client connects
            System.out.println("Connected");
            inputStream = clientSocket.getInputStream();
            outputStream = clientSocket.getOutputStream();
            running = true;
            BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
            while (running) {
                String msg = reader.readLine();
                if (msg == null) { // Stream closed by the client
                    disconnect();
                    break;
                }
                String[] data = msg.split(" ");
                if (data.length > 0) {
                    if (data[0].equals("screen")) { // Touch data
                        inputService.injectMotionEvent(InputDeviceCompat.SOURCE_TOUCHSCREEN, Integer.valueOf(data[1]), Integer.valueOf(data[2]), Integer.valueOf(data[3]));
                    } else if (data[0].equals("key")) { // Key data
                        inputService.injectKeyEvent(new KeyEvent(Integer.valueOf(data[1]), Integer.valueOf(data[2])));
                    } else if (data[0].equals("exit")) { // Termination request
                        disconnect();
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
            disconnect();
        }
    }

    // Disconnection handling
    private static void disconnect() {
        running = false;
        try {
            listener.close();
            if (clientSocket != null)
                clientSocket.close();
        } catch (Exception ex) {
            ex.printStackTrace();
        }
        System.out.println("Disconnected");
    }
}
```
It's a simple server program. I did consider something more modern like JSON for the data exchange with the PC, but it's nothing complicated, so I settled on a space-separated, CSV-like format.
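Concretely, each newline-terminated message that InputHost parses looks like one of the following (the coordinate values here are made up; the action and key codes are explained later):

```
screen 0 540 960
key 0 4
exit
```

The first field selects the message type; for `screen` the remaining fields are action, x, and y, and for `key` they are action and key code.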
That's all for the Android-side implementation. **The whole code is here.**
Next, implement on the PC side.
I'm going to build it with C# and WPF. Why not WinForms? Because it struggled to display the image at 60 FPS: my implementation managed about 30 FPS, and with DoubleBuffered disabled it reached 60 FPS but flickered too badly to be usable.

WPF is not ideal either, but it uses the GPU for drawing, so it beats WinForms. Doing this seriously would mean a graphics API such as OpenGL (OpenTK for C#)... I never imagined the receiving side would be the bottleneck rather than the sending side ^^;
**[Click here for the entire client code](https://github.com/SIY1121/ScreenCastClient)**
UI
```xml:MainWindow.xaml
<Window x:Class="ScreenCastClient.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
        xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
        xmlns:local="clr-namespace:ScreenCastClient"
        mc:Ignorable="d"
        Title="ScreenCastClient" Height="819.649" Width="420.611" Loaded="Window_Loaded" Closing="Window_Closing">
    <Grid>
        <Grid.RowDefinitions>
            <RowDefinition/>
            <RowDefinition Height="60"/>
        </Grid.RowDefinitions>
        <Image x:Name="image" MouseDown="image_MouseDown" MouseUp="image_MouseUp" MouseMove="image_MouseMove"/>
        <Grid Grid.Row="1" Background="#FF008BFF">
            <Grid.ColumnDefinitions>
                <ColumnDefinition Width="204*"/>
                <ColumnDefinition Width="193*"/>
            </Grid.ColumnDefinitions>
            <Polygon Points="0,15 25,0 25,30" Fill="White" Margin="30,17,0,0" HorizontalAlignment="Left" Width="36" MouseDown="Polygon_MouseDown" MouseUp="Polygon_MouseUp" />
            <Ellipse Fill="White" Margin="186,18,181,12" Width="30" HorizontalAlignment="Center" MouseDown="Ellipse_MouseDown" MouseUp="Ellipse_MouseUp" Grid.ColumnSpan="2"/>
            <Rectangle Fill="White" Margin="0,17,30,10" HorizontalAlignment="Right" Width="30" MouseDown="Rectangle_MouseDown" MouseUp="Rectangle_MouseUp" Grid.Column="1"/>
        </Grid>
    </Grid>
</Window>
```
First, here is a pair of helper functions that will be used frequently from now on. Regular expressions are quite useful when you want to pull just the numbers you need out of some output; for building and checking patterns, sites like Online regex tester and debugger: PHP, PCRE, Python, Golang and JavaScript are very handy.
```C#:MainWindow.xaml.cs
// Run a command and return its standard output
private string Exec(string str)
{
    Process process = new Process
    {
        StartInfo =
        {
            FileName = "cmd",
            Arguments = @"/c " + str,
            UseShellExecute = false,
            CreateNoWindow = true,
            RedirectStandardInput = true,
            RedirectStandardError = true,
            RedirectStandardOutput = true
        },
        EnableRaisingEvents = true
    };
    process.Start();
    string results = process.StandardOutput.ReadToEnd();
    process.WaitForExit();
    process.Close();
    return results;
}

// Return an array of the capture groups matched by a regular expression
private string[] GetRegexResult(string src, string pattern)
{
    Regex regex = new Regex(pattern);
    Match match = regex.Match(src);
    string[] res = new string[match.Groups.Count - 1];
    for (int i = 1; i < match.Groups.Count; i++)
        res[i - 1] = match.Groups[i].Value;
    return res;
}
```
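For example, with input shaped like the FFmpeg log parsed later, `GetRegexResult("1080x1920, 60 fps", @"([0-9]*?)x([0-9]*?), [0-9]*? fps")` returns `{ "1080", "1920" }`: group 0 (the whole match) is skipped and only the capture groups come back. (The input string here is a made-up fragment.)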
Programs on any OS have standard input and output. Typical video conversion software reads and writes files, but FFmpeg can also use standard input and output, so if you point its output at stdout, your program can read the decoded data directly. Note that FFmpeg, by design, writes its log to stderr. The following code starts FFmpeg and hooks up its stdout and stderr.
```C#:MainWindow.xaml.cs
private void StartFFmpeg()
{
    // Port forwarding
    Exec("adb forward tcp:8080 tcp:8080");
    var inputArgs = "-framerate 60 -analyzeduration 100 -i tcp://127.0.0.1:8080";
    var outputArgs = "-f rawvideo -pix_fmt bgr24 -r 60 -flags +global_header - ";
    Process process = new Process
    {
        StartInfo =
        {
            FileName = "ffmpeg.exe",
            Arguments = $"{inputArgs} {outputArgs}",
            UseShellExecute = false,
            CreateNoWindow = true,
            RedirectStandardInput = true,
            RedirectStandardError = true, // Make stderr readable
            RedirectStandardOutput = true // Make stdout readable
        },
        EnableRaisingEvents = true
    };
    process.ErrorDataReceived += Process_ErrorDataReceived; // Logs flow from stderr, so handle them separately
    process.Start();
    rawStream = process.StandardOutput.BaseStream; // Image data flows from stdout, so grab the stream
    process.BeginErrorReadLine();
    running = true;
    Task.Run(() =>
    {
        // Read the stream on another thread
        ReadRawData();
    });
}
```
Set the required arguments and start FFmpeg. There are two ways to get data out of the process: register for an event, or grab the stream and read it yourself. The former is easy, but it cannot be used to exchange binary data because the output is converted to character data. The latter deals with raw streams, so it is a bit more cumbersome but allows fine control. Here, the log from stderr is read the former way, and the binary image data from stdout the latter way.
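For reference, the arguments above assemble into this command line (the preceding adb forward makes tcp://127.0.0.1:8080 reach the device; the trailing `-` tells FFmpeg to write the raw frames to stdout):

```
ffmpeg -framerate 60 -analyzeduration 100 -i tcp://127.0.0.1:8080 -f rawvideo -pix_fmt bgr24 -r 60 -flags +global_header -
```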
```C#:MainWindow.xaml.cs
// Read standard error from FFmpeg
private void Process_ErrorDataReceived(object sender, DataReceivedEventArgs e)
{
    if (e.Data == null) return;
    Console.WriteLine(e.Data);
    if (imageWidth == 0 && imageHeight == 0) // The incoming frame size is not known yet
    {
        // Crude work: extract the frame size from FFmpeg's log output
        string[] res = GetRegexResult(e.Data, @"([0-9]*?)x([0-9]*?), [0-9]*? fps");
        if (res.Length == 2)
        {
            imageWidth = int.Parse(res[0]);
            imageHeight = int.Parse(res[1]);
            bytePerframe = imageWidth * imageHeight * 3;
            if (imageWidth > imageHeight) // Landscape screen
            {
                // Swap the maximum touch coordinates
                int tmp = displayWidth;
                displayWidth = displayHeight;
                displayHeight = tmp;
            }
            Dispatcher.Invoke(() => { // The Bitmap must be created on the UI thread for it to show up in the UI
                writeableBitmap = new WriteableBitmap(imageWidth, imageHeight, 96, 96, PixelFormats.Bgr24, null);
                image.Source = writeableBitmap;
            });
        }
    }
}
```
The size of the image is essential for reconstructing the raw data that will come in from now on. When FFmpeg starts converting, it logs information about the output stream, so we do the crude job of extracting the image size from that log. The variable bytePerframe is the number of bytes needed to build one frame: width × height × bytes per pixel. Since FFmpeg is set to output bgr24 here (8 bits each for blue, green, and red, 24 bits in total), one pixel takes 3 bytes. If you are unsure how raw images are structured, see Don't you know? Basic knowledge of images and file structure.
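As a concrete example (assuming a 1080×1920 stream): bytePerframe = 1080 × 1920 × 3 = 6,220,800 bytes, and at 60 fps that is about 373 MB of raw data per second, which is exactly why the receiving side can become the bottleneck.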
```C#:MainWindow.xaml.cs
// Read rawStream from FFmpeg and write frames into the Bitmap
private void ReadRawData()
{
    MemoryStream ms = new MemoryStream();
    byte[] buf = new byte[10240];
    while (running)
    {
        int resSize = rawStream.Read(buf, 0, buf.Length);
        if (resSize <= 0) break; // FFmpeg closed the stream
        if (ms.Length + resSize >= bytePerframe) // The accumulated data reaches one frame
        {
            int needSize = bytePerframe - (int)ms.Length; // Bytes still needed to complete this frame
            int remainSize = (int)ms.Length + resSize - bytePerframe; // Surplus bytes belonging to the next frame
            ms.Write(buf, 0, needSize); // Complete the current frame
            Dispatcher.Invoke(() =>
            {
                if (writeableBitmap != null) // Write the frame data
                    writeableBitmap.WritePixels(new Int32Rect(0, 0, imageWidth, imageHeight), ms.ToArray(), 3 * imageWidth, 0);
            });
            ms.Close();
            ms = new MemoryStream();
            ms.Write(buf, needSize, remainSize); // Carry the surplus over to the next frame
        }
        else
        {
            ms.Write(buf, 0, resSize); // Keep accumulating
        }
    }
}
```
Data read from the stream is accumulated in a MemoryStream, and once one frame's worth has been collected, it is turned back into an image. WriteableBitmap has a method (WritePixels) that writes pixels from an array, so we use that. Note that access to the WriteableBitmap must happen on the UI thread.
```C#:MainWindow.xaml.cs
// Start InputHost and connect to it
private void StartInputHost()
{
    string inputInfo = Exec("adb shell getevent -i"); // Get data about the Android device's input devices
    // Extract the maximum touch coordinates from the output
    string[] tmp = GetRegexResult(inputInfo, @"ABS[\s\S]*?35.*?max (.*?),[\s\S]*?max (.*?),");
    displayWidth = int.Parse(tmp[0]);
    displayHeight = int.Parse(tmp[1]);
    // Port forwarding
    Exec("adb forward tcp:8081 tcp:8081");
    // Get the apk path
    // Strip the extra characters and the line break
    string pathToPackage = Exec("adb shell pm path space.siy.screencastsample").Replace("package:", "").Replace("\r\n", "");
    Process process = new Process
    {
        StartInfo =
        {
            FileName = "adb",
            Arguments = $"shell",
            UseShellExecute = false,
            CreateNoWindow = true,
            RedirectStandardInput = true,
            RedirectStandardError = true,
            RedirectStandardOutput = true
        },
        EnableRaisingEvents = true
    };
    process.Start();
    process.OutputDataReceived += (s, e) =>
    {
        Console.WriteLine(e.Data); // Might do something with this later
    };
    process.BeginOutputReadLine();
    // Start InputHost with shell privileges
    process.StandardInput.WriteLine($"sh -c \"CLASSPATH={pathToPackage} /system/bin/app_process /system/bin space.siy.screencastsample.InputHost\"");
    System.Threading.Thread.Sleep(1000); // Wait for it to start
    TcpClient tcp = new TcpClient("127.0.0.1", 8081); // Connect to InputHost
    streamToInputHost = tcp.GetStream();
}
```
First, adb shell getevent -i is run and its output read. This command prints information about the Android device's input devices; from it we take the maximum coordinate values the touch panel can report (sample output below). Then InputHost is started. Note that there are rough edges here, such as sleeping for one second and hoping it has started.
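The exact format of getevent -i varies by device, but the part the regex targets looks something like this (the values here are hypothetical; codes 0035 and 0036 are ABS_MT_POSITION_X and ABS_MT_POSITION_Y):

```
add device 4: /dev/input/event4
  ...
  events:
    ABS (0003): 0035  : value 0, min 0, max 1079, fuzz 0, flat 0, resolution 0
                0036  : value 0, min 0, max 1919, fuzz 0, flat 0, resolution 0
```

The two max values (here 1079 and 1919) are the touch panel's coordinate range, which GetDisplayPosition uses later for scaling.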
Now that the connection plumbing is done, all that is left is to send data to InputHost.
```C#:MainWindow.xaml.cs
private void image_MouseDown(object sender, MouseButtonEventArgs e)
{
    Point p = GetDisplayPosition(e.GetPosition(image));
    byte[] sendByte = Encoding.UTF8.GetBytes($"screen 0 {p.X} {p.Y}\n");
    streamToInputHost.Write(sendByte, 0, sendByte.Length);
    mouseDown = true;
}

private void image_MouseMove(object sender, MouseEventArgs e)
{
    if (mouseDown)
    {
        Point p = GetDisplayPosition(e.GetPosition(image));
        byte[] sendByte = Encoding.UTF8.GetBytes($"screen 2 {p.X} {p.Y}\n");
        streamToInputHost.Write(sendByte, 0, sendByte.Length);
    }
}

private void image_MouseUp(object sender, MouseButtonEventArgs e)
{
    Point p = GetDisplayPosition(e.GetPosition(image));
    byte[] sendByte = Encoding.UTF8.GetBytes($"screen 1 {p.X} {p.Y}\n");
    streamToInputHost.Write(sendByte, 0, sendByte.Length);
    mouseDown = false;
}

// Convert the mouse position to the device's touch coordinates
private Point GetDisplayPosition(Point p)
{
    int x = (int)(p.X / image.ActualWidth * displayWidth);
    int y = (int)(p.Y / image.ActualHeight * displayHeight);
    return new Point(x, y);
}
```
Data is sent from the image's mouse events. The second space-separated field (0, 1, or 2) is the touch action: 0 for down, 1 for up, 2 for move. These match the values of MotionEvent.ACTION_DOWN, ACTION_UP, and ACTION_MOVE, so InputHost can pass them straight through. Key events work the same way:
```C#:MainWindow.xaml.cs
private void Polygon_MouseDown(object sender, MouseButtonEventArgs e)
{
    byte[] sendByte = Encoding.UTF8.GetBytes($"key 0 4\n");
    streamToInputHost.Write(sendByte, 0, sendByte.Length);
}

private void Polygon_MouseUp(object sender, MouseButtonEventArgs e)
{
    byte[] sendByte = Encoding.UTF8.GetBytes($"key 1 4\n");
    streamToInputHost.Write(sendByte, 0, sendByte.Length);
}

private void Ellipse_MouseDown(object sender, MouseButtonEventArgs e)
{
    byte[] sendByte = Encoding.UTF8.GetBytes($"key 0 3\n");
    streamToInputHost.Write(sendByte, 0, sendByte.Length);
}

private void Ellipse_MouseUp(object sender, MouseButtonEventArgs e)
{
    byte[] sendByte = Encoding.UTF8.GetBytes($"key 1 3\n");
    streamToInputHost.Write(sendByte, 0, sendByte.Length);
}

private void Rectangle_MouseDown(object sender, MouseButtonEventArgs e)
{
    byte[] sendByte = Encoding.UTF8.GetBytes($"key 0 187\n");
    streamToInputHost.Write(sendByte, 0, sendByte.Length);
}

private void Rectangle_MouseUp(object sender, MouseButtonEventArgs e)
{
    byte[] sendByte = Encoding.UTF8.GetBytes($"key 1 187\n");
    streamToInputHost.Write(sendByte, 0, sendByte.Length);
}
```
The second field has the same meaning as above (0 for down, 1 for up). The third is the key's code, which you can look up in KeyEvent: 4 is KEYCODE_BACK, 3 is KEYCODE_HOME, and 187 is KEYCODE_APP_SWITCH, matching the back, home, and recent-apps buttons drawn in the XAML. You can even send keys your device doesn't physically have. For example, 120 is KEYCODE_SYSRQ, the PrintScreen key: a screenshot is usually taken by pressing power and volume-down together, but sending just this one key takes a screenshot too. If you want to try it right away:
```
adb shell input keyevent 120
```
Port forwarding, launching InputHost from the shell, and the rest of the fiddly setup are all handled inside the client software, so it is easy to use.
And that's it. The screen is mirrored, and you can now operate it with the mouse.
I was confused at first, since launching a class inside an apk from the shell is hardly normal app development, but I got it working. Because this feature depends on adb, anyone it is distributed to would need adb installed; then again, now that adb can be downloaded on its own, I think the barrier to entry has come down. (Or is it bundled?)
This gets pretty close to Vysor. However, text entry and file transfer are still missing, so I'd like to implement those next time. Thank you for reading to the end.