Reputation: 5243
I am working with OpenGL and would like to put all my textures into an MP4 file in order to compress them.
I then need to read them back from the MP4 on my Android device.
I need some way to decode the MP4 and get it frame by frame, on request.
I found MediaCodec
and MediaMetadataRetriever,
but I did not see how to request frames one by one.
If anyone has worked with MP4, please point me in the right direction.
P.S. I am working natively (JNI), so it does not matter whether the solution is in Java or native code; I just need to find a way.
I am making a kind of movie (just one 3D model), so I change the geometry as well as the textures every 32 milliseconds. It therefore seems reasonable to me to use MP4 for the textures, because each new frame (every 32 milliseconds) is very similar to the previous one.
Right now I use 400 frames for one model. For geometry I use .mtr and for textures I use .pkm (because it is optimized for Android), so I have around 350 .mtr files (because some files include a subindex) and 400 .pkm files.
This is why I want to use MP4 for the textures: one MP4 is much smaller than 400 .pkm files.
Please take a look at Edit1.
Actually, all I need to know is whether Android has an API that can read an MP4 frame by frame, maybe some kind of getNextFrame() method.
Something like this:
MP4Player player = new MP4Player(PATH_TO_MY_MP4_FILE);

void readMP4() {
    Bitmap b;
    b = player.getNextFrame();
    ///.... my code here ...///
}
I wrote this implementation in Java:
public static void read(@NonNull final Context iC, @NonNull final String iPath) {
    long time;
    int fileCount = 0;

    //Create a new Media Player
    MediaPlayer mp = MediaPlayer.create(iC, Uri.parse(iPath));
    // getDuration() is in milliseconds; 1000L avoids int overflow when converting to microseconds
    time = mp.getDuration() * 1000L;
    Log.e("TAG", String.format("TIME :: %s", time));

    MediaMetadataRetriever mRetriever = new MediaMetadataRetriever();
    // a data source has to be set before getFrameAtTime() can be called
    mRetriever.setDataSource(iC, Uri.parse(iPath));
    long a = System.nanoTime();

    //frame rate 10.03/sec, 1/10.03 = in microseconds 99700
    for (long i = 99700; i <= time; i = i + 99700) {
        Bitmap b = mRetriever.getFrameAtTime(i, MediaMetadataRetriever.OPTION_CLOSEST_SYNC);
        if (b == null) {
            Log.e("TAG", String.format("BITMAP STATE :: %s", "null"));
        } else {
            fileCount++;
        }
        long curTime = System.nanoTime();
        Log.e("TAG", String.format("EXECUTION TIME :: %s", curTime - a));
        a = curTime;
    }
    Log.e("TAG", String.format("COUNT :: %s", fileCount));
}
and here are the execution times:
E/TAG: EXECUTION TIME :: 267982039
E/TAG: EXECUTION TIME :: 222928769
E/TAG: EXECUTION TIME :: 289899461
E/TAG: EXECUTION TIME :: 138265423
E/TAG: EXECUTION TIME :: 127312577
E/TAG: EXECUTION TIME :: 251179654
E/TAG: EXECUTION TIME :: 133996500
E/TAG: EXECUTION TIME :: 289730345
E/TAG: EXECUTION TIME :: 132158270
E/TAG: EXECUTION TIME :: 270951461
E/TAG: EXECUTION TIME :: 116520808
E/TAG: EXECUTION TIME :: 209071269
E/TAG: EXECUTION TIME :: 149697230
E/TAG: EXECUTION TIME :: 138347269
These times are in nanoseconds, so each frame takes roughly 115 to 290 milliseconds. That is very slow; I need around 30 milliseconds per frame.
I think this method executes on the CPU, so the question is: is there a method that executes on the GPU?
I found out that there is MediaCodec,
and I also found a similar question here: MediaCodec get all frames from video.
As I understand it, there is a way to read by bytes, but not by frames.
So the question remains: is there a way to read an MP4 video frame by frame?
Upvotes: 5
Views: 10448
Reputation: 5243
I want to post my implementation as it currently stands.
Here is the .h file:
#pragma once

#include <jni.h>
#include <memory>
#include <string>
#include <vector>
#include <opencv2/opencv.hpp>
#include "looper.h"
#include "media/NdkMediaCodec.h"
#include "media/NdkMediaExtractor.h"

//Originally took from here
//Convert took from here
class NativeCodec
{
public:
    NativeCodec() = default;
    ~NativeCodec() = default;

    void DecodeDone();
    void Pause();
    void Resume();
    bool createStreamingMediaPlayer(const std::string &filename);
    void setPlayingStreamingMediaPlayer(bool isPlaying);
    void shutdown();
    void rewindStreamingMediaPlayer();

    int getFrameWidth() const
    {
        return m_frameWidth;
    }

    int getFrameHeight() const
    {
        return m_frameHeight;
    }

    // Decodes the next frame and copies it into imageData as BGR bytes.
    void getNextFrame(std::vector<unsigned char> &imageData);

private:
    struct Workerdata
    {
        AMediaExtractor *ex;
        AMediaCodec *codec;
        bool sawInputEOS;
        bool sawOutputEOS;
        bool isPlaying;
        bool renderonce;
    };

    void Seek();

    ssize_t m_bufidx = -1;
    int m_frameWidth = -1;
    int m_frameHeight = -1;
    cv::Size m_frameSize;
    Workerdata m_data = {nullptr, nullptr, false, false, false, false};
};
Here is the .cc file:
#include "native_codec.h"
#include <cassert>
#include <jni.h>
#include <cstdio>
#include <cstring>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <cerrno>
#include <climits>
#include "util.h"
#include <android/log.h>
#include <string>
#include <chrono>
#include <android/asset_manager.h>
#include <android/asset_manager_jni.h>
// for native window JNI
#include <android/native_window_jni.h>
using namespace std;
using namespace std::chrono;
bool NativeCodec::createStreamingMediaPlayer(const std::string &filename)
{
    AMediaExtractor *ex = AMediaExtractor_new();
    media_status_t err = AMediaExtractor_setDataSource(ex, filename.c_str());
    if (err != AMEDIA_OK)
    {
        return false;
    }

    size_t numtracks = AMediaExtractor_getTrackCount(ex);
    AMediaCodec *codec = nullptr;

    for (size_t i = 0; i < numtracks; i++)
    {
        AMediaFormat *format = AMediaExtractor_getTrackFormat(ex, i);

        int format_color;
        AMediaFormat_getInt32(format, AMEDIAFORMAT_KEY_COLOR_FORMAT, &format_color);
        bool ok = AMediaFormat_getInt32(format, AMEDIAFORMAT_KEY_WIDTH, &m_frameWidth);
        ok = ok && AMediaFormat_getInt32(format, AMEDIAFORMAT_KEY_HEIGHT,
                                         &m_frameHeight);
        if (ok)
        {
            m_frameSize = cv::Size(m_frameWidth, m_frameHeight);
        } else
        {
            //Asking format for frame width / height failed.
        }

        const char *mime;
        if (!AMediaFormat_getString(format, AMEDIAFORMAT_KEY_MIME, &mime))
        {
            return false;
        } else if (!strncmp(mime, "video/", 6))
        {
            // Omitting most error handling for clarity.
            // Production code should check for errors.
            AMediaExtractor_selectTrack(ex, i);
            codec = AMediaCodec_createDecoderByType(mime);
            // No output surface: decoded frames are read from the output buffers.
            AMediaCodec_configure(codec, format, nullptr, nullptr, 0);
            m_data.ex = ex;
            m_data.codec = codec;
            m_data.sawInputEOS = false;
            m_data.sawOutputEOS = false;
            m_data.isPlaying = false;
            m_data.renderonce = true;
            // The decoder must be started before buffers can be dequeued.
            AMediaCodec_start(codec);
        }
        AMediaFormat_delete(format);
    }
    return true;
}
void NativeCodec::getNextFrame(std::vector<unsigned char> &imageData)
{
    // Feed the decoder: read one compressed sample and queue it as input.
    if (!m_data.sawInputEOS)
    {
        m_bufidx = AMediaCodec_dequeueInputBuffer(m_data.codec, 2000);
        if (m_bufidx >= 0)
        {
            size_t bufsize;
            auto buf = AMediaCodec_getInputBuffer(m_data.codec, m_bufidx, &bufsize);
            auto sampleSize = AMediaExtractor_readSampleData(m_data.ex, buf, bufsize);
            if (sampleSize < 0)
            {
                sampleSize = 0;
                m_data.sawInputEOS = true;
            }
            auto presentationTimeUs = AMediaExtractor_getSampleTime(m_data.ex);

            AMediaCodec_queueInputBuffer(m_data.codec, m_bufidx, 0, sampleSize,
                                         presentationTimeUs,
                                         m_data.sawInputEOS ?
                                         AMEDIACODEC_BUFFER_FLAG_END_OF_STREAM : 0);
            // Move on to the next sample, otherwise the same one is read again.
            AMediaExtractor_advance(m_data.ex);
        }
    }

    // Drain the decoder: take one decoded frame, if any, and convert it to BGR.
    if (!m_data.sawOutputEOS)
    {
        AMediaCodecBufferInfo info;
        auto status = AMediaCodec_dequeueOutputBuffer(m_data.codec, &info, 0);
        if (status >= 0)
        {
            if (info.flags & AMEDIACODEC_BUFFER_FLAG_END_OF_STREAM)
            {
                // output EOS
                m_data.sawOutputEOS = true;
            }

            if (info.size > 0)
            {
                // size_t bufsize;
                uint8_t *buf = AMediaCodec_getOutputBuffer(m_data.codec,
                        static_cast<size_t>(status), /*bufsize*/nullptr);
                // YUV 4:2:0 semi-planar: full-size Y plane plus half-height chroma plane.
                cv::Mat YUVframe(cv::Size(m_frameSize.width, static_cast<int>
                (m_frameSize.height * 1.5)), CV_8UC1, buf);

                cv::Mat colImg(m_frameSize, CV_8UC3);
                cv::cvtColor(YUVframe, colImg, CV_YUV420sp2BGR, 3);
                auto dataSize = colImg.rows * colImg.cols * colImg.channels();
                imageData.assign(colImg.data, colImg.data + dataSize);
            }

            AMediaCodec_releaseOutputBuffer(m_data.codec, static_cast<size_t>(status),
                                            info.size != 0);
            if (m_data.renderonce)
            {
                m_data.renderonce = false;
                return;
            }
        } else if (status == AMEDIACODEC_INFO_OUTPUT_BUFFERS_CHANGED)
        {
            // output buffers changed
        } else if (status == AMEDIACODEC_INFO_OUTPUT_FORMAT_CHANGED)
        {
            auto format = AMediaCodec_getOutputFormat(m_data.codec);
            AMediaFormat_delete(format);
        } else if (status == AMEDIACODEC_INFO_TRY_AGAIN_LATER)
        {
            // no output buffer right now
        } else
        {
            // unexpected info code
        }
    }
}
void NativeCodec::DecodeDone()
{
    if (m_data.codec != nullptr)
    {
        m_data.sawInputEOS = true;
        m_data.sawOutputEOS = true;
    }
}

void NativeCodec::Seek()
{
    AMediaExtractor_seekTo(m_data.ex, 0, AMEDIAEXTRACTOR_SEEK_CLOSEST_SYNC);
    m_data.sawInputEOS = false;
    m_data.sawOutputEOS = false;
    if (!m_data.isPlaying)
    {
        m_data.renderonce = true;
    }
}

void NativeCodec::Pause()
{
    if (m_data.isPlaying)
    {
        // flush all outstanding codecbuffer messages with a no-op message
        m_data.isPlaying = false;
    }
}

void NativeCodec::Resume()
{
    if (!m_data.isPlaying)
    {
        m_data.isPlaying = true;
    }
}

void NativeCodec::setPlayingStreamingMediaPlayer(bool isPlaying)
{
    if (isPlaying)
    {
        Resume();
    } else
    {
        Pause();
    }
}

void NativeCodec::shutdown()
{
    m_bufidx = -1;
}

void NativeCodec::rewindStreamingMediaPlayer()
{
    Seek();
}
So, with this implementation, you need to set up OpenCV for the format conversion (in my case from YUV to BGR).
To understand how to do that, check these two sources.
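For anyone who wants to see what the YUV to BGR step does without pulling in OpenCV, here is a minimal sketch in plain Java. It assumes the decoder output is NV21 (a full-size Y plane followed by interleaved V/U pairs), even width and height, and BT.601 video-range coefficients. The class Nv21ToBgr and its method are names I made up for illustration, and the real decoder output format can differ between devices, so treat this as an illustration rather than a drop-in replacement.

```java
/** Illustrative NV21 (YUV 4:2:0 semi-planar) to packed BGR conversion. */
final class Nv21ToBgr {
    private Nv21ToBgr() {}

    /** Converts one NV21 frame into width * height * 3 bytes in B, G, R order. */
    static byte[] convert(byte[] nv21, int width, int height) {
        if (width <= 0 || height <= 0 || width % 2 != 0 || height % 2 != 0) {
            throw new IllegalArgumentException("width and height must be positive and even");
        }
        int frameSize = width * height;
        if (nv21.length < frameSize * 3 / 2) {
            throw new IllegalArgumentException("buffer too small for an NV21 frame");
        }
        byte[] bgr = new byte[frameSize * 3];
        for (int row = 0; row < height; row++) {
            // One chroma row is shared by two luma rows.
            int chromaRow = frameSize + (row / 2) * width;
            for (int col = 0; col < width; col++) {
                int y = Math.max((nv21[row * width + col] & 0xFF) - 16, 0);
                int chromaIndex = chromaRow + (col & ~1); // V first, then U
                int v = (nv21[chromaIndex] & 0xFF) - 128;
                int u = (nv21[chromaIndex + 1] & 0xFF) - 128;

                // Fixed-point BT.601 video-range coefficients (scaled by 256).
                int c = 298 * y;
                int r = clamp((c + 409 * v + 128) >> 8);
                int g = clamp((c - 100 * u - 208 * v + 128) >> 8);
                int b = clamp((c + 516 * u + 128) >> 8);

                int out = (row * width + col) * 3;
                bgr[out] = (byte) b;
                bgr[out + 1] = (byte) g;
                bgr[out + 2] = (byte) r;
            }
        }
        return bgr;
    }

    private static int clamp(int value) {
        return Math.max(0, Math.min(255, value));
    }
}
```

The output layout matches what the C++ code above copies out of colImg: three bytes per pixel in B, G, R order, row by row.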
As a sample, I am also leaving my CMakeLists.txt here:
#For add OpenCV take a look at this video
#Look at the video than compare with this file and make the same
set(pathToOpenCv C:/OpenCV-android-sdk)
cmake_minimum_required(VERSION 3.4.1)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=gnu++11")
# Import the ARCore library.
add_library(arcore SHARED IMPORTED)
set_target_properties(arcore PROPERTIES IMPORTED_LOCATION
# Import the glm header file from the NDK.
add_library(glm INTERFACE)
set_target_properties(glm PROPERTIES
# This is the main app library.
add_library(hello_ar_native SHARED
add_library(lib_opencv SHARED IMPORTED)
set_target_properties(lib_opencv PROPERTIES IMPORTED_LOCATION
target_include_directories(hello_ar_native PRIVATE
target_link_libraries(hello_ar_native ${log-lib} lib_opencv
You need to create the streaming media player with createStreamingMediaPlayer
and then just call getNextFrame for each frame.
Feel free to ask questions.
Upvotes: 1
Reputation: 40370
Yes, there is a way to extract single frames from an MP4 video.
In principle, you seem to be looking for an alternative way to load textures, where the usual way is GLUtils.texImage2D (which fills a texture from a Bitmap).
First, you should consider what others advise, and expect visual artifacts from compression. But assuming that your textures form a related sequence (e.g. an explosion), getting them from a video stream makes sense. For unrelated images you'll get better results using JPG or PNG. Also note that MP4 video doesn't have an alpha channel, which is often used in textures.
For this task you can't use MediaMetadataRetriever; it won't give you the accuracy needed to extract all frames.
You'd have to work with the MediaCodec and MediaExtractor classes. The Android documentation for MediaCodec is detailed.
In effect you'll need to implement a kind of customized video player and add one key function: frame step.
The closest thing to this is Android's MediaPlayer, which is a complete player, but 1) it lacks frame step, and 2) it is effectively closed, because it's implemented by a lot of native C++ libraries which are impossible to extend and hard to study.
I advise this from experience of creating a frame-by-frame video player. I did it by adopting MediaPlayer-Extended, which is written in plain Java (no native code), so you can include it in your project and add the functions you need. It works with Android's MediaCodec and MediaExtractor.
Somewhere in the MediaPlayer class you'd add a function for frameStep, and add another signal plus function in PlaybackThread to decode just the next frame (in paused mode). The implementation of this would be up to you. The result is that you let the decoder obtain and process a single frame, consume that frame, then repeat with the next one. I did it, so I know that this approach works.
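To make the frame-step idea more concrete, here is a minimal sketch of a paused-mode worker thread that decodes exactly one frame per request. It is plain Java and not taken from MediaPlayer-Extended: FrameStepper and decodeNextFrame are hypothetical names, and the Supplier stands in for whatever MediaCodec/MediaExtractor code produces the next frame (it must not return null).

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.function.Supplier;

/** Worker thread that stays paused and decodes one frame per step request. */
final class FrameStepper<T> implements AutoCloseable {
    private final BlockingQueue<Boolean> requests = new ArrayBlockingQueue<>(1);
    private final SynchronousQueue<T> frames = new SynchronousQueue<>();
    private final Thread worker;

    /** decodeNextFrame is a placeholder for the real decoder call. */
    FrameStepper(Supplier<T> decodeNextFrame) {
        worker = new Thread(() -> {
            try {
                while (true) {
                    requests.take();                   // sleep until a step is requested
                    frames.put(decodeNextFrame.get()); // decode one frame and hand it over
                }
            } catch (InterruptedException e) {
                // close() was called: leave the loop and let the thread end
            }
        }, "frame-stepper");
        worker.setDaemon(true);
        worker.start();
    }

    /** Signals the worker and blocks until the next decoded frame is available. */
    T frameStep() {
        try {
            requests.put(Boolean.TRUE);
            return frames.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a frame", e);
        }
    }

    @Override
    public void close() {
        worker.interrupt();
    }
}
```

The caller consumes each frame before asking for the next one, so the decoder never runs ahead of the renderer. In a real player the same signal would sit next to the existing play and pause messages.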
The other half of the task is obtaining the result. A video player (with MediaCodec) outputs frames into a Surface. Your task would be to get the pixels.
I know a way to read an RGB bitmap from such a surface: you create an OpenGL pbuffer EGLSurface, let MediaCodec render into this surface (Android's SurfaceTexture), then read the pixels from this surface. This is another nontrivial task: you need to create a shader to render the OES texture (the surface), and use GLES20.glReadPixels to read the RGB pixels into a ByteBuffer. You'd then upload these RGB bitmaps into your textures.
However, since you want to load textures, you may find an optimized way to render the video frame directly into your textures and avoid moving pixels around.
Hope this helps, and good luck with the implementation.
Upvotes: 2
Reputation: 52303
The solution would look something like the ExtractMpegFramesTest, in which MediaCodec is used to generate "external" textures from video frames. In the test code, the frames are rendered to an off-screen pbuffer and then saved as PNG. You would just render them directly.
There are a few problems with this:
The getFrameAtTime() method has less-than-desirable performance for the reasons noted above. You're unlikely to get better results by writing it yourself, although you can save a bit of time by skipping the step where it creates a Bitmap object. Also, you passed OPTION_CLOSEST_SYNC in, but that will only produce the results you want if all your frames are sync frames (again, clumsy database of JPEG images). You need to use OPTION_CLOSEST.
If you're just trying to play a movie on a texture (or your problem can be reduced to that), Grafika has some examples. One that may be relevant is TextureFromCamera, which renders the camera video stream on a GLES rect that can be zoomed and rotated. You can replace the camera input with the MP4 playback code from one of the other demos. This'll work fine if you're only playing forward, but if you want to skip around or go backward you'll have trouble.
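To illustrate the OPTION_CLOSEST_SYNC point, here is a toy model in plain Java of which frame a seek lands on. It is not how Android implements seeking: it simply assumes a constant frame interval and a sync frame every syncEvery frames, and SeekModel and its methods are made-up names.

```java
/** Toy model of which frame a seek lands on; not the Android implementation. */
final class SeekModel {
    private SeekModel() {}

    /** Index of the frame nearest to timeUs (models OPTION_CLOSEST). */
    static long closestFrame(long timeUs, long frameIntervalUs) {
        return (timeUs + frameIntervalUs / 2) / frameIntervalUs;
    }

    /** Index of the sync frame nearest to timeUs (models OPTION_CLOSEST_SYNC). */
    static long closestSyncFrame(long timeUs, long frameIntervalUs, int syncEvery) {
        long syncIntervalUs = frameIntervalUs * syncEvery;
        return (timeUs + syncIntervalUs / 2) / syncIntervalUs * syncEvery;
    }

    public static void main(String[] args) {
        long frameIntervalUs = 99_700; // the step used in the question
        int syncEvery = 10;            // assumed sync-frame spacing
        for (int i = 1; i <= 12; i++) {
            long timeUs = i * frameIntervalUs;
            System.out.println("request " + i
                    + " -> closest " + closestFrame(timeUs, frameIntervalUs)
                    + ", closest sync " + closestSyncFrame(timeUs, frameIntervalUs, syncEvery));
        }
    }
}
```

In this model many different requests map to the same sync frame, which is why stepping through a video with OPTION_CLOSEST_SYNC returns repeated images unless every frame is a sync frame.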
The problem you're describing sounds pretty similar to what 2D game developers deal with. Doing what they do is probably the best approach.
Upvotes: 13
Reputation: 670
I can see why it might seem easy to have all your textures in a single file, but this is a really, really bad idea.
MP4 is a video format, and the codecs it carries are highly optimised for a sequence of frames that are very similar to their neighbours, i.e. motion. They are also optimised to be decompressed in sequential order, so a 'random access' approach will be very inefficient.
To give a bit more detail: video codecs store key frames (about one a second, but the rate varies) and delta frames the rest of the time. The key frames are compressed independently, just like separate images, but the delta frames are stored as the difference from one or more other frames. The algorithm assumes this difference will be fairly small once motion compensation has been performed.
So if you want to access a single delta frame, your code will have to decompress a nearby key frame and all the delta frames that connect it to the frame you want. This will be much slower than just using single-frame JPEGs.
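As a rough illustration of that cost, here is a small sketch in plain Java. It assumes a key frame at every multiple of keyInterval and delta frames that only depend on earlier frames; real streams can have irregular key frame spacing and frames that reference later ones, so the numbers are only indicative. SeekCost is a made-up name.

```java
/** Rough cost model for random access into a stream of key and delta frames. */
final class SeekCost {
    private SeekCost() {}

    /**
     * Number of frames that have to be decoded to show frame {@code target}
     * when starting from scratch, given a key frame every {@code keyInterval} frames.
     */
    static int framesToDecode(int target, int keyInterval) {
        if (target < 0 || keyInterval <= 0) {
            throw new IllegalArgumentException("target must be >= 0 and keyInterval > 0");
        }
        int previousKeyFrame = (target / keyInterval) * keyInterval;
        return target - previousKeyFrame + 1; // the key frame, the deltas, and the target
    }

    public static void main(String[] args) {
        int keyInterval = 30; // assumed: one key frame every 30 frames
        for (int target : new int[] {0, 1, 15, 29, 30}) {
            System.out.println("frame " + target + " -> decode "
                    + framesToDecode(target, keyInterval) + " frame(s)");
        }
    }
}
```

Sequential playback only pays for one frame at a time because the decoder keeps its state; a cold random access just before the next key frame pays for almost the whole interval.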
In short, use JPEG or PNG to compress your textures and add them all to a single archive file to keep it tidy.
Upvotes: 2