Reputation: 81
I have started a simple project that establishes a SIP session for a voice call and then forwards the RTP packets to another endpoint. I used the PJSUA2 Python library, and right now I can answer voice calls in the onIncomingCall callback. I have been trying to capture the raw RTP packets instead of sending them to a sound device, but I could not find any way to do it.
I tried pjsua2.AudioMediaPort, but it requires a sound device. Because this application is deployed in the cloud, there is no sound device, so media.startTransmit returns a "no device" error.
Thanks in advance for your response!
Upvotes: 0
Views: 49
Reputation: 331
I recently ran into a similar problem.
Have a look at the AudioMediaPort class: you can create a custom port to pipe the audio to. From the docs:
void createPort(const string &name, MediaFormatAudio &fmt)
Create an audio media port and register it to the conference bridge.
Parameters:
name – The port name.
fmt – The audio format.
I used this class to implement a callback that gives direct access to the audio frames, which you could re-pack into RTP, although in your use case you might just forward the audio. In the example below I just write the frames to a file, which let me verify that it all worked.
    import pjsua2 as pj

    class AudioMediaPort(pj.AudioMediaPort):
        """Custom conference-bridge port that writes received frames to a file."""

        def __init__(self):
            super().__init__()
            self.name = "AudioMediaPort"
            self.file = open(f"{self.name}.raw", "wb")  # Open file in binary mode

        def onFrameReceived(self, frame: pj.MediaFrame) -> None:
            """Callback triggered when an audio frame is received."""
            if frame.buf:
                self.file.write(bytes(frame.buf))  # Convert frame buffer to bytes and write
                self.file.flush()  # Ensure data is written immediately
            else:
                print("Received empty frame")

        def onFrameRequested(self, frame: pj.MediaFrame) -> None:
            """Callback triggered when a frame is requested (not needed for writing)."""
            pass  # No need to generate frames

        def close(self):
            """Ensure the file is closed properly."""
            if self.file:
                self.file.close()
    class MyCall(pj.Call):
        def __init__(self, acc: pj.Account, call_id: int):
            # acc is the account object (e.g. your own pj.Account subclass)
            super().__init__(acc, call_id)
            self.media = None
            self.media_port = None  # Created in onCallMediaState

        def onCallMediaState(self, prm: pj.OnCallMediaStateParam) -> None:
            print("*** onCallMediaState")
            call_info: pj.CallInfo = self.getInfo()
            for mi in call_info.media:
                if mi.type == pj.PJMEDIA_TYPE_AUDIO:
                    audio_media: pj.AudioMedia = self.getAudioMedia(mi.index)  # Get audio media
                    media_conf: pj.ConfPortInfo = audio_media.getPortInfo()
                    fmt: pj.MediaFormatAudio = media_conf.format
                    # Create and register the custom media port
                    self.media_port = AudioMediaPort()
                    self.media_port.createPort("callback_port", fmt)

        def onCallState(self, prm: pj.OnCallStateParam) -> None:
            ci = self.getInfo()
            print(f"Call state: {ci.stateText}")
            if ci.state == pj.PJSIP_INV_STATE_CONFIRMED:
                print("Call is confirmed. Setting up media transmission.")
                for i in range(len(ci.media)):
                    if ci.media[i].type == pj.PJMEDIA_TYPE_AUDIO:
                        print(f"Configuring audio media for index {i}")
                        self.media = self.getAudioMedia(i)  # Ensure we get the right audio media
                        if self.media_port:  # Ensure media port was created
                            self.media.startTransmit(self.media_port)
                            print("Audio transmission to custom media port started.")
            if ci.state == pj.PJSIP_INV_STATE_DISCONNECTED:
                print("Call ended. Cleaning up media.")
                if self.media and self.media_port:
                    self.media.stopTransmit(self.media_port)
                if self.media_port:
                    print("Closing media port.")
                    self.media_port.close()
                self.media = None
                self.media_port = None
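If you do want to re-pack the frames into RTP yourself, here is a minimal sketch of building the 12-byte fixed RTP header (RFC 3550) in plain Python. It is not part of PJSUA2: the function name build_rtp_packet and the sequence number, timestamp, SSRC, and payload type values are my own assumptions for illustration. Also keep in mind that, as far as I know, the frames you get from the conference bridge are decoded PCM, so you would have to encode them (for example to G.711) before sending them with the matching payload type.

```python
import struct

RTP_VERSION = 2  # RTP version defined by RFC 3550


def build_rtp_packet(payload: bytes, seq: int, timestamp: int, ssrc: int,
                     payload_type: int, marker: bool = False) -> bytes:
    """Prepend a fixed 12-byte RTP header to an already-encoded payload.

    Hypothetical helper: no padding, no header extension, no CSRC list.
    """
    byte0 = RTP_VERSION << 6                          # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(
        "!BBHII",                                     # network byte order
        byte0,
        byte1,
        seq & 0xFFFF,                                 # 16-bit sequence number, wraps
        timestamp & 0xFFFFFFFF,                       # 32-bit media timestamp, wraps
        ssrc & 0xFFFFFFFF,                            # 32-bit stream identifier
    )
    return header + payload


# Example: one 20 ms G.711 frame at 8 kHz is 160 bytes (payload type 0 = PCMU).
# The payload here is dummy data, not real encoded audio.
packet = build_rtp_packet(b"\xff" * 160, seq=1, timestamp=160,
                          ssrc=0x12345678, payload_type=0)
```

For each following frame you would increase seq by 1 and timestamp by the number of samples in the frame, then send the packet over a UDP socket to the other endpoint.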
Upvotes: 0
Reputation: 81
According to my research, neither the Python nor the C++ high-level API provides this flexibility. You should use pjlib, the core C library, directly.
Upvotes: 0