SdtElectronics

Reputation: 586

Best Practice to Read a Desired Amount of Data from a Serial Port?

Since read() only returns the data that is currently available from a serial port, this answer suggests using a while loop to wait until the desired length of data has been read. But AFAIK system calls are expensive, thus, isn't that approach somehow crude, especially when the desired length is large? I know that whether this would cause a performance issue depends on the actual application, and that I should not do premature optimization. I am just curious that whether there is a better practise to avoid intensive system calls in this scenario?

Upvotes: 1

Views: 1881

Answers (3)

sawdust

Reputation: 17047

But AFAIK system calls are expensive, ...

True, a system call can consume many more CPU cycles than a local procedure/function call. The system call requires CPU-mode transitions from user mode to (protected) kernel mode, and then back to user mode.

... thus, isn't that approach somehow crude, especially when the desired length is large?

The first question you have to ask yourself when reading from a serial terminal (e.g. a /dev/ttyXn device, rather than a serial port) is what kind of data is going to be received. That is, does the data consist of lines of (ASCII) text terminated by some type of EOL (end of line) character, or does the data simply need to be treated as binary (or raw) data?

Lines of text should be read from a serial terminal using canonical (aka cooked) mode. The OS will perform a lexical scan of the received data for your program, and delimit each line of text based on the EOL characters you specify. The read() can then return a line assuming that blocking I/O is used, and the line of text is not longer than the buffer that is provided.
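As a rough illustration of the canonical-mode setup described above, here is a minimal sketch. The helper names set_canonical() and read_line() are made up for this example, the choice of flags is an assumption about a typical device link (no echo, no signal characters), and baud rate, parity and EOL mapping (e.g. ICRNL) are deliberately left out.

```c
#include <stddef.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: switch an already-open terminal fd to canonical
 * (line-at-a-time) mode. Returns 0 on success, -1 on error. */
int set_canonical(int fd)
{
    struct termios tio;

    if (tcgetattr(fd, &tio) != 0)
        return -1;                      /* e.g. fd is not a terminal */

    tio.c_lflag |= ICANON;              /* let the OS assemble lines */
    tio.c_lflag &= ~(ECHO | ECHOE | ECHONL | ISIG);  /* assumed: device link, no echo */
    tio.c_cflag |= CREAD | CLOCAL;      /* enable receiver, ignore modem lines */

    return tcsetattr(fd, TCSANOW, &tio);
}

/* Hypothetical helper: one blocking read(). In canonical mode this
 * returns at most one line. The result is NUL-terminated.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_line(int fd, char *buf, size_t size)
{
    ssize_t n;

    if (size == 0)
        return -1;
    n = read(fd, buf, size - 1);
    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```

With this setup, one read() per line is all that is needed, provided the buffer is at least as long as the longest expected line.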

Binary data should be read from a serial terminal using noncanonical (aka raw) mode. The OS will ignore the values of the data, and (when blocking I/O is used) each read() will return an amount of data based on constraints of time and number of bytes.

See this answer for more details.

Note that the post your question refers to actually is about reading text, yet the OP is (mis)using non-canonical mode. If the OP had used the proper mode to match the input, then he might have never had a partial-read problem.


I am just curious that whether there is a better practise to avoid intensive system calls in this scenario?

Proper termios configuration is essential for efficient I/O with serial terminals.

Blocking I/O mode should be considered the preferred mode.
The OS can perform its scheduling for multitasking better when processes relinquish control more often.
The OS is more efficient in determining when data is available for return to a user.
Note also that the termios configuration is most effective when blocking mode is used, e.g. the VMIN and VTIME specifications in noncanonical mode.
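To make the VMIN and VTIME remark more concrete, here is a minimal sketch of a noncanonical setup for blocking reads. The function name set_noncanonical() is made up for this example, the flag selection is an assumption modelled on a typical "raw" 8-bit configuration, and the baud rate is not touched.

```c
#include <termios.h>

/* Hypothetical helper: put an already-open terminal fd into noncanonical
 * mode. vmin is the minimum number of bytes a blocking read() waits for;
 * vtime is a timeout in tenths of a second (an inter-byte timer when
 * vmin > 0). Returns 0 on success, -1 on error. */
int set_noncanonical(int fd, cc_t vmin, cc_t vtime)
{
    struct termios tio;

    if (tcgetattr(fd, &tio) != 0)
        return -1;                      /* e.g. fd is not a terminal */

    /* No input translation, no output processing, no line editing. */
    tio.c_iflag &= ~(IGNBRK | BRKINT | PARMRK | ISTRIP |
                     INLCR | IGNCR | ICRNL | IXON);
    tio.c_oflag &= ~OPOST;
    tio.c_lflag &= ~(ECHO | ECHONL | ICANON | ISIG | IEXTEN);

    /* Assumed framing: 8 data bits, no parity. */
    tio.c_cflag &= ~(CSIZE | PARENB);
    tio.c_cflag |= CS8 | CREAD | CLOCAL;

    tio.c_cc[VMIN]  = vmin;
    tio.c_cc[VTIME] = vtime;

    return tcsetattr(fd, TCSANOW, &tio);
}
```

For instance, set_noncanonical(fd, 64, 1) would make a blocking read() wait for the first byte and then return once 64 bytes have arrived or 0.1 s has passed without a new byte, whichever comes first. Since cc_t is typically an unsigned char, VMIN cannot exceed 255, so larger messages still need more than one read().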

For example, using a select() or poll() and then a read() costs one more syscall than just the (blocking) read(). And yet you can find many such code examples because there seems to be some popular misconception that the program can get the data faster from the "UART" that way.
But non-blocking and async modes are not necessarily faster (in a multitasking OS), and the read() merely fetches data from the termios buffer which is several layers removed from the actual hardware.

If your program uses non-blocking mode but does not perform useful work while waiting for data, and instead uses select() or poll() (or even worse calls sleep()), then your program is unnecessarily complex and inefficient. See this answer.
A blocking-mode read() can do all that waiting for your program, make your program simpler and easier to write and maintain, and be more runtime efficient.

However, for blocking non-canonical reads, you will have to accept some degree of inefficiency. The best you can do is trade off latency against the number of syscalls. One possible example is this answer, which tries to fetch as much data as possible per syscall, yet allows for an easy byte-by-byte lexical scan of the received binary data.
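For the case in the question (a known number of bytes is wanted), the loop itself can stay small. The following is only a sketch and not the code from the linked answer; read_exact() is a hypothetical name. Each read() asks for everything that is still missing, so the number of syscalls is governed by how the data arrives and by VMIN/VTIME, not by the loop.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: read exactly `want` bytes from a blocking fd,
 * looping over partial reads. Returns the number of bytes stored in buf
 * (less than `want` only if read() returned 0, i.e. EOF or a VTIME
 * timeout with VMIN == 0), or -1 on error. */
ssize_t read_exact(int fd, void *buf, size_t want)
{
    unsigned char *p = buf;
    size_t got = 0;

    while (got < want) {
        /* Always request all remaining bytes in one call. */
        ssize_t n = read(fd, p + got, want - got);

        if (n > 0) {
            got += (size_t)n;
        } else if (n == 0) {
            break;                      /* nothing more to read */
        } else if (errno != EINTR) {
            return -1;                  /* real error */
        }
        /* On EINTR, simply retry. */
    }
    return (ssize_t)got;
}
```

Because the fd is in blocking mode, the process sleeps inside read() between arrivals instead of spinning.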


Note that a possible source of latency when reading a serial terminal is a poorly configured kernel, rather than the termios API and read() overhead.
For instance, setting the ASYNC_LOW_LATENCY flag via ioctl() (e.g. see High delay in RS232 communication on a PXA270) is one way to improve read() latency.

Upvotes: 1

Codo

Reputation: 78835

Serial connections (derived from the RS-232 standard) are very slow connections by today's standards. They usually don't support data rates higher than 1 Mbps, compared to up to 40 Gbps for USB4. So when working with serial connections, the system is mainly waiting. The key point is to avoid busy waiting, i.e. if no data is arriving, the system shouldn't spend any time on the serial connection.

The linked code fails at that. As it sets the O_NDELAY flag on opening the port, read() will not block but instead immediately return 0 if no data is available. This will likely absorb 100% of a single CPU. It's not relevant if system calls are involved or not. The loop runs as quick and often as it can until new data arrives. It lacks a mechanism to wait without wasting CPU time.

The easiest solution is to not set the O_NDELAY flag. Then read() will block if no data is available. And it does so without spending any CPU time at all. Linux will wake up the thread or process once new data arrives.
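If the port has to be opened with O_NDELAY for some other reason (for example so that open() itself does not wait on the modem control lines), the flag can be cleared again afterwards so that read() blocks. A minimal sketch, assuming a platform where O_NDELAY and O_NONBLOCK are the same flag (as on most Linux systems); make_blocking() is a hypothetical name:

```c
#include <fcntl.h>

/* Hypothetical helper: clear O_NONBLOCK on an open fd so that subsequent
 * read() calls block until data is available.
 * Returns 0 on success, -1 on error. */
int make_blocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags & ~O_NONBLOCK);
}
```

After this call the read loop no longer spins; the kernel puts the thread to sleep until data arrives.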

If blocking isn't an option, there are many other options but they depend on the rest of your code, which we don't know anything about.

Furthermore, I recommend being very careful with statements like "system calls are expensive". In this form, the statement is wrong. Of course, everything has a cost in terms of CPU time, memory space, elapsed time etc. But without any numbers, or without relating it to alternative operations, such a statement causes more harm than good.

Upvotes: 0

Enrico Migliore

Reputation: 229

To avoid making system calls too intensively, you might use the sleep() function.

Reading data from a serial port is usually a job for a dedicated thread.

The thread sleeps for 100 ms, then reads data from the serial port and enqueues it in a FIFO queue.

Upvotes: 0
