Reputation: 619
I am currently writing a UART console on an ATMega1284p. It is supposed to echo the characters back, so that the console on the computer side actually shows what is being typed. That is all it does for now.
Here is the problem: with ASCII it works perfectly fine, but if I send anything beyond ASCII, e.g. a '§', my minicom shows "�§". I would understand seeing only '�' (an invalid character) or only '§' (everything works fine), but getting the combination of both throws me off and I currently have no idea where the problem is!
Here is part of my code:
    char c;
    while(m_uart->recv(c) > 0) {
        m_lineBuff[m_lineIndex++] = c;
        if(c == '\r') {
            c = '\n';
            m_lineBuff[m_lineIndex++] = c;
            m_sendCount = 2;
        } else {
            m_sendCount = 1;
        }
        this->send();
        if(c == '\n') {
            m_lineBuff[m_lineIndex++] = '\0';
            // invoke some callbacks that handle the line at some point
            m_lineIndex = 0;
        }
    }
`m_lineBuff` is a self-written (and tested) vector of chars. `m_uart` is a self-written (and also tested) UART driver for the micro's internal hardware UART. `this->send` sends `m_sendCount` bytes using `m_uart`.
What I tried so far: I verified that the baud rates of minicom and my micro match (115200). I verified that the frequency is within the 2% range (the micro is running at 20MHz). Both minicom and the micro are set up for 8n1. I verified that minicom works by hooking it up to a little board I had lying around. On that board any UTF-8 character works just fine.
Does anyone see my mistake, or does anyone have a clue as to what I haven't considered?
I'll be happy to supply up to all of my code if you guys are interested in it.
EDIT/Elaboration:
Observation 1 (prior to starting this project)
The PC-side program (minicom) can send characters to and receive characters from the microcontroller. It does not show the sent characters, though.
Conclusion 1 (prior to starting this project)
The microcontroller side needs to send the characters back to the PC, so that you have the behaviour of a console. Thus I immediately send back any character I get.
Observation 2 (after implementing it)
When I press '§' (or any other character consisting of more than 1 byte) (using minicom) I see "�§".
Conclusion 2 (after implementing it)
Something I can't explain with my knowledge is going on. Maybe a small delay between the two bytes making up the character leads to minicom printing a '�' first, because the first byte on its own is indeed an invalid character, and when the second byte comes in minicom realizes that it's actually '§' but doesn't remove/overwrite the '�'. If that is the problem, then how do I solve it? Does my microcontroller need to react faster/with less delay in between characters?
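One way to test the hypothesis in conclusion 2 would be to hold back the bytes of a multi-byte character and echo them in a single burst once the character is complete. The following is only a minimal sketch under that assumption: `utf8SequenceLength` and `Utf8EchoBuffer` are hypothetical names that are not part of the code above, and the sketch does not check that the follow-up bytes really are continuation bytes.

```cpp
#include <stddef.h>
#include <stdint.h>

// Number of bytes in a UTF-8 sequence, judged from its first byte.
// Returns 0 for a continuation byte (10xxxxxx) or an invalid lead byte.
inline size_t utf8SequenceLength(uint8_t lead) {
    if ((lead & 0x80) == 0x00) return 1; // 0xxxxxxx: plain ASCII
    if ((lead & 0xE0) == 0xC0) return 2; // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3; // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4; // 11110xxx
    return 0;
}

// Hypothetical helper: collects received bytes until one full character
// is available, so that it can be echoed without a gap between its bytes.
class Utf8EchoBuffer {
public:
    // Feed one received byte. Returns the number of bytes that are now
    // ready to be echoed (0 while the character is still incomplete).
    size_t push(uint8_t byte) {
        if (m_count == 0) {
            size_t len = utf8SequenceLength(byte);
            m_expected = (len == 0) ? 1 : len; // stray bytes pass through unchanged
        }
        m_bytes[m_count++] = byte;
        if (m_count < m_expected) return 0;
        size_t ready = m_count;
        m_count = 0;
        return ready;
    }

    // The bytes of the most recently completed character.
    const uint8_t* data() const { return m_bytes; }

private:
    uint8_t m_bytes[4] = {};
    size_t m_count = 0;
    size_t m_expected = 0;
};
```

In the receive loop, the return value of `push` could take the place of `m_sendCount`: nothing is sent while it is 0, and the whole character is sent at once when it is non-zero. If the stray '�' still appears with this in place, the delay between bytes is probably not the cause.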
EDIT2:
I replaced the '?' with the actual character '�' using the power of copy and paste.
More tests I did
I tried the character '😹' and, as I expected (it backs my conclusion 2), I got "���😹". '😹', by the way, is a 4-byte character. I set the baud rate of the micro and minicom to 9600: exact same behaviour. I managed to put minicom into hex mode: it sends normally but outputs hex. When I send '😹' I get "f0 9f 98 b9", which (at least according to this site) is correct. Does that back my conclusion 2? And more importantly: how do I get rid of that behaviour? It works with my little Linux board instead of my micro.
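The hex dump can also be checked by hand instead of relying on a website. Below is a minimal sketch of a UTF-8 encoder; `encodeUtf8` is a hypothetical helper name, not something from the code above. For U+1F639 ('😹') it yields f0 9f 98 b9, and for U+00A7 ('§') it yields c2 a7.

```cpp
#include <stddef.h>
#include <stdint.h>

// Encodes one Unicode code point as UTF-8 into out (room for 4 bytes needed).
// Returns the number of bytes written, or 0 for a surrogate or a value
// above U+10FFFF.
inline size_t encodeUtf8(uint32_t cp, uint8_t* out) {
    if (cp < 0x80) {
        out[0] = static_cast<uint8_t>(cp);
        return 1;
    }
    if (cp < 0x800) {
        out[0] = static_cast<uint8_t>(0xC0 | (cp >> 6));
        out[1] = static_cast<uint8_t>(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0; // surrogates are not valid
        out[0] = static_cast<uint8_t>(0xE0 | (cp >> 12));
        out[1] = static_cast<uint8_t>(0x80 | ((cp >> 6) & 0x3F));
        out[2] = static_cast<uint8_t>(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp < 0x110000) {
        out[0] = static_cast<uint8_t>(0xF0 | (cp >> 18));
        out[1] = static_cast<uint8_t>(0x80 | ((cp >> 12) & 0x3F));
        out[2] = static_cast<uint8_t>(0x80 | ((cp >> 6) & 0x3F));
        out[3] = static_cast<uint8_t>(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Since the four bytes seen in hex mode match this encoding, the echo itself appears to return the right bytes in the right order.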
Upvotes: 2
Views: 3175
Reputation: 7342
EDIT: the OP discovered on his own that the odd behaviour he observed is (probably) a bug in minicom itself. This post of mine clearly loses its value; unless the community thinks that it should be removed, I will leave it here as a record of possible workarounds for similar problems.
tl;dr: it appears that your PC application might not be interpreting UTF-8 correctly.
If we look at the Extended ASCII code defined by ISO 8859-1,
A7  10100111  §  => Section sign
and according to this page, the UTF-8 encoding of § is
U+00A7  §  c2 a7  => SECTION SIGN
So my educated guess is that the symbol is still printed correctly because it belongs to the Extended ASCII code with the same value a7.
Either your end application fails to correctly interpret the UTF-8 lead byte (c2), which is why you get a '?' printed out, or a component in the middle fails to pass the correct value forward. I am inclined to believe your output is an instance of the first case.
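To illustrate the difference between the two interpretations, here is a minimal sketch that decodes the same two bytes, c2 a7, once as ISO 8859-1 and once as UTF-8. The function names `decodeLatin1` and `decodeUtf8Short` are hypothetical, and the UTF-8 decoder is deliberately simplified: it only handles 1- and 2-byte sequences and maps everything else to U+FFFD ('�').

```cpp
#include <stddef.h>
#include <stdint.h>

// ISO 8859-1: every byte is a code point of its own.
inline size_t decodeLatin1(const uint8_t* in, size_t len, uint32_t* out) {
    for (size_t i = 0; i < len; ++i) out[i] = in[i];
    return len;
}

// Simplified UTF-8 decoder (1- and 2-byte sequences only).
// Anything it cannot decode becomes U+FFFD, the replacement character.
inline size_t decodeUtf8Short(const uint8_t* in, size_t len, uint32_t* out) {
    size_t n = 0;
    size_t i = 0;
    while (i < len) {
        uint8_t b = in[i];
        if (b < 0x80) {
            out[n++] = b; // plain ASCII
            i += 1;
        } else if ((b & 0xE0) == 0xC0 && i + 1 < len &&
                   (in[i + 1] & 0xC0) == 0x80) {
            // 110xxxxx 10yyyyyy -> xxxxxyyyyyy
            out[n++] = static_cast<uint32_t>(((b & 0x1F) << 6) | (in[i + 1] & 0x3F));
            i += 2;
        } else {
            out[n++] = 0xFFFD; // lone or unsupported byte
            i += 1;
        }
    }
    return n;
}
```

Read as ISO 8859-1, c2 a7 gives two characters, U+00C2 and U+00A7. Read as UTF-8, it gives the single character U+00A7. A decoder that sees c2 on its own, without the following a7, produces U+FFFD.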
You claim that minicom works; I cannot refute this claim, but I would suggest you try the following things first:
NB: this is kind of an incomplete answer, but I couldn't fit everything in the comments. If you're patient enough, please update your question with your findings and comment on this answer to notify me. I'll get back here and update my answer accordingly.
Upvotes: 3