PDF content stream
0.750000 0.000000 0.000000 -0.750000 0.000000 841.920044 cm
q
0.367090 0.000000 0.000000 0.367090 0.000000 0.000000 cm
0.000000 0.000000 0.000000 rg
0.000000 0.000000 0.000000 RG
0.410 w
BT
2 Tr
/F1 40.959999 Tf
1 0 0.000000 -1 847.679993 158.720001 Tm
[<3581>-10.000000<043B>-10.000000<18C5>-20.000000<4374>-10.000000<3635><084D>-20.000000<2195>-10.000000<477D>-10.000000<0B5E>-10.000000<1C3E>-10.000000<34F6>-10.000000<3E98>-20.000000<0003>] TJ
ET
/F1 40.959999 Tf
means the PDF selects font F1 and sets the font size to 40.959999.
I have a question about whether the actual font size really is 40.959999. A font size of about 40 seems too large, and the text displayed in Adobe Acrobat Pro is nowhere near that large.
I get the font size by calling TextPosition.getFontSizeInPt()
(using PDFBox), and it returns 40.96.
I think this is not correct.
Can anybody tell me how to get the correct font size?
Do I need to take the '0.750000 0.000000 0.000000 -0.750000 0.000000 841.920044 cm'
operator into account?
How to get the font size using PDFBox
TextPosition.getFontSize() returns only the first value, i.e. the size operand of Tf (40.959999 here).
TextPosition.getFontSizeInPt() returns something like that first value scaled by the matrices.
Here it still comes back as 40.96, so it does not give a meaningful size for this PDF.
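// Imports use the PDFBox 2.x package names.
import java.util.HashMap;
import java.util.Map;
import org.apache.pdfbox.pdmodel.graphics.state.PDGraphicsState;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.text.TextPosition;
import org.apache.pdfbox.util.Matrix;

public class PDFCustomTextStripper extends PDFTextStripper {
    /**
     * Maps each TextPosition to the graphics state that was active when it was processed.
     */
    private final Map<TextPosition, PDGraphicsState> textPositionPDGraphicsStates = new HashMap<>();

    @Override
    protected void processTextPosition(TextPosition text) {
        // Store a copy: the engine updates the current graphics state in place on later cm operators.
        textPositionPDGraphicsStates.put(text, getGraphicsState().clone());
        ......
    }
}

// Belongs to a holder object (not shown) that exposes getTextPosition() and getPdGraphicsState().
public float getActualFontSize() {
    final float fontSizeInPt = getTextPosition().getFontSizeInPt();
    try {
        final Matrix ctm = getPdGraphicsState().getCurrentTransformationMatrix();
        // Apply the CTM scale; use the smaller axis in case x and y are scaled differently.
        return Math.min(Math.abs(ctm.getScaleX() * fontSizeInPt), Math.abs(ctm.getScaleY() * fontSizeInPt));
    } catch (Exception e) {
        // Fall back to the unscaled size, e.g. when no graphics state was stored.
        return fontSizeInPt;
    }
}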
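As a sanity check on the idea above, here is a minimal sketch that redoes the arithmetic by hand for the content stream in the question, without PDFBox. The class EffectiveFontSize and its methods multiply and effectiveFontSize are made-up names for illustration, not PDFBox API. It assumes the usual PDF convention that each cm is pre-multiplied onto the current transformation matrix and that the text matrix is applied before the CTM; it ignores horizontal scaling (Tz) and any page rotation.

```java
import java.util.Locale;

class EffectiveFontSize {

    /** Multiplies two PDF matrices given as [a b c d e f]; the result applies m first, then n. */
    static double[] multiply(double[] m, double[] n) {
        return new double[] {
            m[0] * n[0] + m[1] * n[2],
            m[0] * n[1] + m[1] * n[3],
            m[2] * n[0] + m[3] * n[2],
            m[2] * n[1] + m[3] * n[3],
            m[4] * n[0] + m[5] * n[2] + n[4],
            m[4] * n[1] + m[5] * n[3] + n[5]
        };
    }

    /** Tf size multiplied by the vertical scale of (text matrix x CTM). */
    static double effectiveFontSize(double tfSize, double[] textMatrix, double[] ctm) {
        double[] trm = multiply(textMatrix, ctm);
        // The unit vector (0, 1) maps to (c, d); its length is the vertical scale factor.
        return tfSize * Math.hypot(trm[2], trm[3]);
    }

    public static void main(String[] args) {
        // Values copied from the content stream in the question.
        double[] outerCm = {0.75, 0, 0, -0.75, 0, 841.920044};
        double[] innerCm = {0.367090, 0, 0, 0.367090, 0, 0};
        double[] tm = {1, 0, 0, -1, 847.679993, 158.720001};

        // The later cm is applied first, so it goes on the left.
        double[] ctm = multiply(innerCm, outerCm);
        double size = effectiveFontSize(40.959999, tm, ctm);
        System.out.println(String.format(Locale.ROOT, "%.2f", size)); // prints 11.28
    }
}
```

So for this stream the two cm operators shrink the nominal 40.96 to roughly 11.28 units on the page, which is consistent with the text not looking large in Acrobat.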