Note: I wrote UFPDF as an experiment, not as a finished product. If you have problems using it, don't bug me for support. Patches are welcome though, but I don't have much time to maintain this.
FPDF is a PHP class for generating PDF files on-the-fly. Unfortunately it does not support Unicode. So I've coded UFPDF, an extension of FPDF which accepts input in UTF-8.
Only TrueType fonts are supported for now. To embed .TTF files, you need to extract the font metrics and build the required tables using the provided utilities (see README.txt). Included is a modified version of TTF2PT1 which extracts the Unicode glyph info.
UFPDF works the same as FPDF, except that all text is in UTF-8, so consult the FPDF documentation for usage.

Maybe you made my day
I have ported some fpdf code to RealBASIC and I was stuck on the UTF-8 issue. I will look at your code; maybe it will help me with my migration.
RealBASIC uses UTF-8 and other encodings, but primarily UTF-8.
For an application that I'm developing I use Spanish characters and other signs, so this could help me a lot, and also other people using fpdf on RealBASIC (which, by the way, will be freeware like fpdf) with special fonts.
Good work.
File lsansuni.php ?
Is this file expected to be included in the package or to be built myself?
Its absence prevents the ufpdf-test from working.
Philippe
Yes
If you read the included short blurb of text, you'll see that this is a file for a particular font (Lucida Sans Unicode) which I did not include for copyright reasons.
Too bad I didn't read this earlier
I recently completed an app which used japanese text. I used the mbfpdf class that's floating around. Of course, all the docs were Japanese, but I managed. Nice to know there is an alternative to EUC or SJIS.
UFPDF and Chinese/Japanese
I'm not sure UFPDF is a good solution for Japanese (or Chinese). It requires you to embed the TTF file in the PDF, which means several megabytes extra for Japanese fonts (even gzipped).
The normal procedure for Japanese text inside PDFs is to use special CID mappings and use the default Japanese fonts that come with Acrobat. However, I don't think there are predefined CID maps for Unicode, only for 'legacy' encodings like SJIS.
In theory it should be possible to make a CID map which maps UTF-16 to the mapping that Adobe uses. You'd need to create a CID map similar to the one I created based on the .ufm file (see makefontuni.php) and embed it so that one of the default Japanese fonts is used with it.
Great job! Just a problem, though...
Hi Steven,
great to have UTF-8 support in FPDF. Thanks a lot.
Just a problem: I seem not to be able to use the MultiCell method. Instead of getting automatic line breaks, I get long lines outside the margins.
Am I missing something?
Thanks for your kind attention.
Andrea
UTF-8 support in FPDF
Hi!
I read Andrea's comments with interest. The same thing happens to me: I do not get automatic line breaks with the MultiCell method. Any suggestions that would explain this? Thanks a lot.
roberto Perfetti
Missing line breaks when using UTF-8 with FPDF
Good morning,
I am taking the liberty of contacting you after reading your comment about the text not wrapping onto a new line.
I have the same problem too. I cannot solve it, despite repeated attempts.
I would like to exchange information, in the hope of resolving the issue.
Thank you for your interest.
Roberto Perfetti
Well, I sort of solved the pr
Well, I sort of solved the problem: indeed there is no support for multicell in UFPDF.
I've added it.
You can contact me at the following email address:
arossato AT istitutocolli DOT org
I'll send you the bits.
Andrea
Patch for Visual C++ 2003 Toolkit (free)
Please add this at the start of the winbuild.bat
It will save some time for a lot of people. Maybe you add the exe to the archive because there is no copyright issue in this.
:: --- intersol's patch for autobuild on VC 2003 Toolkit (free version)
cd /D %VCToolkitInstallDir%
call vcvars32.bat
cd /D "%~dp0"
:: --- end of intersol's patch
PS. it would be great if UFPDF can become part of FPDF !!!
Problems with faulty characters (Czech)
First let me thank you for UFPDF, really great for languages like Czech which are not covered by FPDF. I have been trying to use it and I discovered a problem. I followed all the steps, prepared the Lucida Sans Unicode stuff and started using it, and then discovered that the character "č" (a small c with a hacek) comes out as a capital C with a dot. How can I fix this myself? Is there a file which perhaps contains a mistake and assigns the wrong character to "č"? Capital Č works fine.
Thanks for any help
Roman Dergam
PS: I can send an attachment privately if needed
Do i really need to embed the fonts?
I want to know if it's possible to use the default fonts with UFPDF. The PDF file format specification requires such fonts to exist. I want to generate some PDF files, but the font face is not so important, so I want to use the default ones.
How can I do that without losing UTF-8 compatibility?
Default fonts
PDF is built with legacy encodings in mind and supporting Unicode properly is tricky and a lot more complicated than it should be. The default fonts are accessed through named glyph mappings. As far as I know you can only use 256 glyphs at once with them.
You'd need a dynamic font mapper or something which splits up the text based on character ranges. That's more work than I'm willing to put in.
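To illustrate what "splitting up the text based on character ranges" could look like, here is a minimal sketch. It is not part of FPDF or UFPDF: the function names `utf8_codepoint` and `split_by_block` are hypothetical, and grouping by blocks of 256 code points is only an assumption chosen to match the 256-glyph limit mentioned above. A real font mapper would still have to choose a font and an encoding for each run.

```php
<?php
// Decode a single UTF-8 encoded character to its code point.
function utf8_codepoint($ch) {
    $b = array_values(unpack('C*', $ch));
    $n = count($b);
    if ($n === 1) return $b[0];
    if ($n === 2) return (($b[0] & 0x1F) << 6) | ($b[1] & 0x3F);
    if ($n === 3) return (($b[0] & 0x0F) << 12) | (($b[1] & 0x3F) << 6) | ($b[2] & 0x3F);
    return (($b[0] & 0x07) << 18) | (($b[1] & 0x3F) << 12)
         | (($b[2] & 0x3F) << 6) | ($b[3] & 0x3F);
}

// Hypothetical helper: split a UTF-8 string into runs whose code points
// fall into the same block of 256 code points.
function split_by_block($text) {
    // The /u flag makes PCRE treat the string as UTF-8, one character per item.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    $runs = array();
    foreach ($chars as $ch) {
        $block = utf8_codepoint($ch) >> 8;   // block number = code point / 256
        $last = count($runs) - 1;
        if ($last >= 0 && $runs[$last]['block'] === $block) {
            $runs[$last]['text'] .= $ch;     // same block: extend the current run
        } else {
            $runs[] = array('block' => $block, 'text' => $ch);
        }
    }
    return $runs;
}

// "ab", then U+010D (c with caron), then U+2665 (heart), written as UTF-8 bytes.
foreach (split_by_block("ab\xC4\x8D\xE2\x99\xA5") as $run) {
    printf("block %d: %d byte(s)\n", $run['block'], strlen($run['text']));
}
```

Each run could then be written with a font covering that block, which is exactly the extra bookkeeping described above.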
RE:Problems with faulty characters (Czech)
Sorry that I will not write in English, but I'm too lazy and I hope that who needs to read this can speak Czech.
I had the same problem. I think it is caused by a badly written program, which I believe is called ttf2utf. It seems to generate a wrong something.ufm file, and it is enough to change that. You just need to swap things around and change the values near the entry U 266, Cdot (that is the C with a dot), which gets changed to ccaron (that is č). Then I think some value gets rewritten there, something like W 500. I don't remember it very well any more. If need be, get in touch at my email and I'll send you the changed file so that it works. And then, if you want, you can solve my problem: Link() doesn't work for me with ufpdf. Also, if you have downloaded the annotations extension (it's on the fpdf forum) and adapted it properly so that it works, does č work for you there? Unfortunately it doesn't for me and I need to solve it. I think the text annotation doesn't use the right font and that makes a mess.
Well, I hope I helped somehow.
Problem with Link and Annotations
Hi,
thank you for this great class. I'm working with Czech texts. I think that there is an error in the program ttf2utf (maybe the name isn't correct). The Czech character č, which is code point 269 (U+010D) and is called ccaron, is badly represented in the *.ufm file. I had to change the values because it was showing the character Cdot (U+010A, code point 266). But that is not primarily why I'm writing.
I want to use the Link function, but I can't get the right URL: it points to some crazy characters, so it's useless. Please help me with this somehow.
The next problem is that I need to use annotations. There is an extension for fpdf for this; you can find it in the forum on fpdf.org. But in annotations that 'bad' character doesn't work: it still shows a capital C instead of the ccaron character. I think this is because in annotations Acrobat doesn't use the font I created. So it is the same problem I had with normal functions like Write (and which I solved). I'm at a loss.
I can send you a PDF file with text where the ccaron character is displayed correctly and, next to it, a text annotation where this character is badly represented.
Thank you for your replies.
Hluchej
Czech characters
Hi, thanks for the advice. I now know where to look, but I didn't understand what to change and how. This is what it looks like:
U 268 ; WX 692 ; N Ccaron ; G 252 ;
U 269 ; WX 512 ; N ccaron ; G 253 ;
U 273 ; WX 629 ; N dmacron ; G 254 ;
U 256 ; WX 690 ; N Amacron ; G 258 ;
U 257 ; WX 552 ; N amacron ; G 259 ;
U 258 ; WX 690 ; N Abreve ; G 260 ;
U 259 ; WX 552 ; N abreve ; G 261 ;
U 260 ; WX 690 ; N Aogonek ; G 262 ;
U 261 ; WX 552 ; N aogonek ; G 263 ;
U 264 ; WX 692 ; N Ccircumflex ; G 264 ;
U 265 ; WX 512 ; N ccircumflex ; G 265 ;
U 266 ; WX 692 ; N Cdot ; G 266 ;
U 267 ; WX 512 ; N cdot ; G 267 ;
And what now? Otherwise, please send me the file at aurka (at) centrum cz
Thanks, Roman
Link
HI,
I've already solved the problem with link function. But I still have problem with that faulty character ccaron. Pls help me.
The write function produces a different result with ufpdf
Hi,
I use this test code:
<?php
$pdf->write(5,'jlmkjmljl');
$pdf->write(5,'aaaaaaaaaaaaaa');
?>
With fpdf, I get 1 line.
But with ufpdf, I get 2 lines.
How can I get 1 line with ufpdf?
Thanks
hyperlinks with UFPDF ???
Maybe I am just doing something wrong, but I can't figure out how to produce clickable hyperlinks within the PDF document. When I use the original FPDF everything works fine.
Anyone have an idea how to produce clickable links with UFPDF ??
best
moloko
Something Wrong With Chinese Unicode
I have encountered a problem creating fonts for Chinese:
only a blank PDF page is created with my fonts.
Does anyone have experience with this?
Could anyone share a pre-built Chinese TTF font?
I have the same problem, but
I have the same problem, but I do not understand Czech, even though we use this letter in Lithuanian too :) . So if you have solved this, could you please translate it into English, because I was not able to decipher your conversation in Czech :)
Thanks!
Nice work!
Thanks a lot for this port.
For work I needed FPDF to support UTF-8.
I saved a heck of a lot of hours :)
I have the same problem -
I have the same problem with MultiCell :( Can somebody help me?
Czech "č" character
Hello, could you translate what you said about fixing the "č" problem? Or could you please send the recompiled exe to my email address? It's really important. Thanks a lot.
PS: I don't speak Czech, sorry.
Czech-Characters
Hello everybody,
I just fixed the problem with the faulty czech character "č" for myself and thought this could be helpful for you. (in my case the font is Lucida Sans Unicode, Windows-TTF-filename "l_10646.ttf" .. )
Simply open your ufm-File in an editor, mark the range from line "Ccaron" to "cdot" (pay attention to upper and lower case!!!) and overwrite it with the following block :
U 268 ; WX 692 ; N Ccaron ; G 252 ;
U 269 ; WX 692 ; N ccaron ; G 266 ;
U 273 ; WX 629 ; N dmacron ; G 254 ;
U 256 ; WX 690 ; N Amacron ; G 258 ;
U 257 ; WX 552 ; N amacron ; G 259 ;
U 258 ; WX 690 ; N Abreve ; G 260 ;
U 259 ; WX 552 ; N abreve ; G 261 ;
U 260 ; WX 690 ; N Aogonek ; G 262 ;
U 261 ; WX 552 ; N aogonek ; G 263 ;
U 264 ; WX 692 ; N Ccircumflex ; G 264 ;
U 265 ; WX 512 ; N ccircumflex ; G 265 ;
U 266 ; WX 512 ; N Cdot ; G 253 ;
U 267 ; WX 512 ; N cdot ; G 267 ;
Then save it, run makefont.php for this ufm-File, and it should work.
I had to move the files provided by "makefont.php" to another directory .. but that's just depending on how you handle it for yourselves. :-)
Greetings
Matthias
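For anyone who would rather inspect a .ufm file with a script than by eye, here is a minimal sketch of a parser for the metric lines quoted above. The function name `parse_ufm_line` is hypothetical, and the field meanings (U = Unicode code point, WX = width, N = glyph name, G = glyph index) are inferred from the lines in this thread rather than taken from a ttf2ufm specification.

```php
<?php
// Parse one character-metric line of a .ufm file, for example
//   "U 269 ; WX 512 ; N ccaron ; G 253 ;"
// Returns an array keyed by field name, or null if the line does not
// contain at least the U and N fields.
function parse_ufm_line($line) {
    $fields = array();
    foreach (explode(';', $line) as $part) {
        $part = trim($part);
        if ($part === '') {
            continue;                       // skip the empty piece after the last ';'
        }
        $pieces = preg_split('/\s+/', $part, 2);
        if (count($pieces) !== 2) {
            return null;                    // not a "KEY value" pair
        }
        $fields[$pieces[0]] = $pieces[1];
    }
    return isset($fields['U'], $fields['N']) ? $fields : null;
}

$entry = parse_ufm_line('U 269 ; WX 512 ; N ccaron ; G 253 ;');
printf("U+%04X %s width=%s glyph=%s\n",
    (int) $entry['U'], $entry['N'], $entry['WX'], $entry['G']);
```

Running this over every line of a .ufm file and printing the entries named ccaron and Cdot shows at a glance which glyph index each one points to, before and after editing.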
UTF-8, Write, MultiCell, ... Auto Line Break
Hi everybody, Salut à tous.
I just found an (easy) way to fix the "Line-break" issue while using ufpdf (in methods Write & MultiCell),
In fact it is in FPDF that some lines have to be modified, not in UFPDF.
Since UTF-8 'chars' can be 1 to 4 bytes long, the following code does not work:
<?php
$c = $s{$i};
$l = $cw[$c];
/* $l will always be 0.
Indeed $c (actually the first byte of $c) is 0. If you have a look at your font.php file, you'll see that there is nothing at index '0' in the array $cw (which sounds OK, since NUL is not a valid character).
*/
?>
The fix consists in replacing every occurrence of:
<?php
$l = $cw[$c];
?>
with:
<?php
$l = $cw[ord($c)];
?>
ord($c) returns the numeric value of the first byte of $c (at least it works for me with UTF-8 and Latin-1; I haven't checked other encodings).
Voilà.
Again, many thanks to both UFPDF & FPDF developers.
PS: Sorry for the quality of my English, and apologies in advance if I've said something silly... in any case it seems to work for me, in English, French, Arabic and Turkish.
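As a side note on what ord() does with multi-byte input, here is a small sketch. It shows that ord() only looks at the first byte, and how a width could instead be summed per code point. The helper names `utf8_to_codepoints` and `string_width` and the width values are hypothetical, and whether the real `$cw` table in a given font file is keyed by code point is an assumption, so this is not the fix described above.

```php
<?php
// Decode a UTF-8 string into a list of code points.
function utf8_to_codepoints($s) {
    $out = array();
    foreach (preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY) as $ch) {
        $b = array_values(unpack('C*', $ch));
        switch (count($b)) {
            case 1:
                $out[] = $b[0];
                break;
            case 2:
                $out[] = (($b[0] & 0x1F) << 6) | ($b[1] & 0x3F);
                break;
            case 3:
                $out[] = (($b[0] & 0x0F) << 12) | (($b[1] & 0x3F) << 6) | ($b[2] & 0x3F);
                break;
            default:
                $out[] = (($b[0] & 0x07) << 18) | (($b[1] & 0x3F) << 12)
                       | (($b[2] & 0x3F) << 6) | ($b[3] & 0x3F);
        }
    }
    return $out;
}

// Sum character widths using a table keyed by code point (assumed layout).
function string_width($s, $cw) {
    $w = 0;
    foreach (utf8_to_codepoints($s) as $cp) {
        $w += isset($cw[$cp]) ? $cw[$cp] : 0;
    }
    return $w;
}

// U+010D (c with caron) is the two bytes C4 8D in UTF-8.
echo ord("\xC4\x8D"), "\n";                  // 196, the first byte only

// Illustrative widths only, not taken from a real font file.
$cw = array(0x61 => 552, 0x10D => 512);
echo string_width("a\xC4\x8D", $cw), "\n";   // 1064
```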
Arabic with UFPDF
Hello,
First thanks for the job.
I managed to get Arabic written to the PDF, but left to right instead of right to left. Do you have any idea where I can find a way to solve this?
Arabic in UFPDF
Hello,
Just one question: how do you get it to work in Arabic? I have only managed to get it written left to right, not right to left.
Thanks in advance if there is an answer.
Link
Hello,
Could anybody explain to me how to add an external link with UFPDF?
I only get a link with strange characters.
Thanks
ccaron problem
Hi,
I tried your solution, but I am surprised: your block contains 13 lines, whereas in my .ufm (in my case the fonts are ariali and arialbi.ttf) the block to overwrite contains 158 lines.
I am afraid of losing many other characters...
If anybody can send me by mail a corrected version of ttf2ufm.exe, or correctly generated ariali and arialbi fonts (free fonts), I'll be happy!
pc@l-ami.net
Thanks
Trouble with Page Count
I've got a very strange issue, while this code works perfectly:
<?php
define('FPDF_FONTPATH', 'llib/font/');
include_once('llib/ufpdf.php');

class PDF extends FPDF
{
    function Footer()
    {
        $this->SetX(15);
        $this->SetY(15);
        $this->SetFont('Arial','',14);
        $this->Cell(0,10,'Page '.$this->PageNo().'/{nb}',0,0,'C');
    }
}

$pdf=new PDF();
$pdf->AliasNbPages();
$pdf->Output();
?>
It prints a PDF with the text "1/1",
but when I change "extends FPDF" to "extends UFPDF", it just prints "1/{nb}".
Does anyone know a way to solve this or work around it?
Thanks
Displaying special symbols
Hi, everyone
I've been using fpdf successfully for quite some time for my catalogues. I now need to include two special symbols in my text: ♀ and ♥.
So I found UFPDF, replaced my fpdf with it, and hoped that the Unicode entities (&#9792; and &#9829;) in my text would be replaced by the symbols. But it still displays &#...
Thanks for any advise!!!
Benjamin
UTF-8
Enter the characters literally and use UTF-8 encoding, don't use entities.
Though you'll have to have a font which contains the symbols.
mysql character set
Hi, Steven,
thanks for your fast reply. Unfortunately my data is stored in a mysql database, which doesn't seem to have the necessary character sets installed. When I enter the heart symbol (via phpmyadmin) it changes into the entity.
So, that's that. I think I have to figure out the database-settings and then try it again.
Any experiences anyone?
Best wishes
Benjamin
Entity decoder (GPL)
I wrote the following entity decoder for Drupal (code is GPL licensed). If you run your text through it, all entities should be decoded to UTF-8 bytes.
Alternatively if you are using an ISO-8859-x encoding in your mysql database, you can just force your browser to UTF-8 encoding. UTF-8 is not corrupted when stored as ISO-8859-x, as it just stores the literal byte stream.
<?php
/**
 * Decode all HTML entities (including numerical ones) to regular UTF-8 bytes.
 * Double-escaped entities will only be decoded once ("&amp;lt;" becomes "&lt;", not "<").
 *
 * @param $text
 *   The text to decode entities in.
 * @param $exclude
 *   An array of characters which should not be decoded. For example,
 *   array('<', '&', '"'). This affects both named and numerical entities.
 */
function decode_entities($text, $exclude = array()) {
  static $table;
  // We store named entities in a table for quick processing.
  if (!isset($table)) {
    // Get all named HTML entities, asking PHP for UTF-8 values directly.
    $table = array_flip(get_html_translation_table(HTML_ENTITIES, ENT_COMPAT, 'UTF-8'));
    // Add apostrophe (XML)
    $table['&apos;'] = "'";
  }
  $newtable = array_diff($table, $exclude);
  // Use a regexp to select all entities in one pass, to avoid decoding double-escaped entities twice.
  // A callback is used because the /e modifier no longer exists in PHP 7 and later.
  return preg_replace_callback('/&(#x?)?([A-Za-z0-9]+);/', function ($m) use ($newtable, $exclude) {
    return _decode_entities($m[1], $m[2], $m[0], $newtable, $exclude);
  }, $text);
}

/**
 * Helper function for decode_entities
 */
function _decode_entities($prefix, $codepoint, $original, $table, $exclude) {
  // Named entity
  if (!$prefix) {
    if (isset($table[$original])) {
      return $table[$original];
    }
    else {
      return $original;
    }
  }
  // Hexadecimal numerical entity
  if ($prefix == '#x') {
    if (!ctype_xdigit($codepoint)) {
      return $original;
    }
    $codepoint = hexdec($codepoint);
  }
  // Decimal numerical entity
  else {
    if (!ctype_digit($codepoint)) {
      return $original;
    }
    $codepoint = (int) $codepoint;
  }
  // Encode codepoint as UTF-8 bytes
  if ($codepoint < 0x80) {
    $str = chr($codepoint);
  }
  else if ($codepoint < 0x800) {
    $str = chr(0xC0 | ($codepoint >> 6))
         . chr(0x80 | ($codepoint & 0x3F));
  }
  else if ($codepoint < 0x10000) {
    $str = chr(0xE0 | ( $codepoint >> 12))
         . chr(0x80 | (($codepoint >> 6) & 0x3F))
         . chr(0x80 | ( $codepoint & 0x3F));
  }
  else if ($codepoint < 0x200000) {
    $str = chr(0xF0 | ( $codepoint >> 18))
         . chr(0x80 | (($codepoint >> 12) & 0x3F))
         . chr(0x80 | (($codepoint >> 6) & 0x3F))
         . chr(0x80 | ( $codepoint & 0x3F));
  }
  else {
    // Out of range: leave the entity untouched.
    return $original;
  }
  // Check for excluded characters
  if (in_array($str, $exclude)) {
    return $original;
  }
  else {
    return $str;
  }
}
?>
Thanks for the script but...
Hi, Steven,
thanks a lot again. I tried your script
<?php
$text = "hallo &#9829;";
echo decode_entities($text, $exclude = array('&'));
?>
and got the following interesting result:
hallo â™¥
I also tried your second proposal and changed the character encoding to UTF-8, but same result.
Where am I wrong?
ben
ISO-8859-1
Those 3 characters are the iso-8859-1 interpretation of the original character in UTF-8 encoding. If you pass this literally to UFPDF it should work correctly. The entity decoder works with literal bytes, so changing your internal encoding in PHP to UTF-8 will mess it up.
If you have any more problems, you're on your own. I don't provide support for UFPDF.
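A quick way to check the "3 characters" explanation is to look at the raw bytes. This sketch uses PHP's built-in html_entity_decode() instead of the decoder posted above, purely to keep the example short; the variable name `$heart` is arbitrary.

```php
<?php
// Decode the numeric entity for the heart symbol (U+2665) to UTF-8.
$heart = html_entity_decode('&#9829;', ENT_QUOTES, 'UTF-8');

echo strlen($heart), "\n";    // 3: one character, three bytes
echo bin2hex($heart), "\n";   // e299a5
```

A browser or editor that reads those three bytes as a single-byte encoding shows three separate characters, even though the byte stream itself is valid UTF-8 and can be passed on unchanged.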
Unable to extract the embedded font 'freeSans'.
Hi everyone.
I get a blank page and an error when opening the PDF file.
Error says:
Unable to extract the embedded font 'freeSans'.
Is there solution for this error?
Thanks
Unicode vs. text-search-function in Acrobat Reader
Hello everybody,
I've got a problem with my (UF)PDF!
The file itself looks perfect! All characters (even czech, polish and hungarian) are displayed perfectly! In my script, I encode the whole text with utf8_encode() and save it in a MySQL database.
Now to my problem: when I take this (error-free) PDF file and try to search (Ctrl + F) for any word that is shown on any page of the PDF, the search returns "0 entrys"! :-(
Even if I search for the first word on the first page .. "0 entrys" .. although I see this word on my screen!!! :-(
I'd be happy, if anyone could help me! Thanx a lot ...
Greetings
Matthias
"č" problem
I solved it by modifying only one line in the ufm file.
For example, in verdana.ufm the original was:
U 266 ; WX 698 ; N Cdot ; G 328 ;
which I changed into:
U 266 ; WX 521 ; N Cdot ; G 254 ;
I believe G 254 stands for "č", so I just replaced the value.
WX is the width of the character; you should modify it according to the font that you're trying to fix.
And don't forget to run makefontuni.php after modifying the ufm file.
License
Argh. Why did you take the very nice non-restrictive "public domain" license for FPDF and make this project into GPL? I'm all in favor of open source, but some of those who hire me are not. Now I can't use FPDF without converting the entire project to GPL.
Would you please consider something like the LGPL to make sure that a. anyone can include the code in commercial products as well and b. improvements to the code should still be free?
Free is subjective
Simple: I don't want others making a profit selling code which I write for free. You are free to use UFPDF to provide a commercial service though.
Also, UFPDF is just an experiment, not finished code.
Russian Font
Hi,
Is there a font which includes Russian characters and the rest (Czech, German, French) as well? I have tried Lucida Sans Unicode, but the Russian characters were wrong :(
Or do I have to switch the font for each character? That would be tedious :(((
I am using TCPDF library as well but I hope this is not the problem
Thank you
Could not include font metric file
I have been trying to implement UFPDF into my programs which currently use FPDF. We have changed to use UTF-8 within our system and now must change the pdf generation programs. I have also tried TCPDF and do not seem to be able to get anywhere with either. The following is a simple program that I have been trying to get this working in. I'm currently getting the following error "Could not include font metric file".
<?php
include_once("../set_env_pdf.php");
define('FPDF_FONTPATH','ufpdf/font/');
include_once('ufpdf/ufpdf.php');
$pdf=new PDF('P','mm','Letter',true);
$pdf->AliasNbPages();
$pdf->AddPage();
$pdf->SetX(2);
$pdf->SetFont('vera','',12);
$pdf->SetX(5);
$pdf->Cell(0,9,'Hello World',0,0,'L');
$pdf->Ln(3.5);
$pdf->Output();
?>
I am using the fonts that were downloaded with TCPDF (vera.php, vera.z, vera.ctg.z) and they are in the fonts directory 1 down from the ufpdf directory ( This font has been declared in fpdf.php). The program ufpdf.php is in the ufpdf directory. I'm at a loss as to why I keep getting this error and everything I read on it tells me that the font is in the wrong directory but I don't see that as the issue in this case. Any assistance that anyone could offer would be greatly appreciated.
Thanks,
Joanne
No support
Note the big bold letters at the top that say "If you have problems using it, don't bug me for support.".
Compile errors
Attempting to compile ttf2ufm on CentOS-4 with freetype I get the following error:
gcc -O2 -D_GNU_SOURCE -DUSE_FREETYPE -I/usr/include -I/usr/include/freetype2 -DPREFER_FREETYPE -c ft.c
I fil inkluderad från ft.c:15:
/usr/include/freetype2/freetype/freetype.h:20:2: #error "`ft2build.h' hasn't been included yet!"
/usr/include/freetype2/freetype/freetype.h:21:2: #error "Please always use macros to include FreeType header files."
/usr/include/freetype2/freetype/freetype.h:22:2: #error "Example:"
/usr/include/freetype2/freetype/freetype.h:23:2: #error " #include <ft2build.h>"
/usr/include/freetype2/freetype/freetype.h:24:2: #error " #include FT_FREETYPE_H"
make: *** [ft.o] Fel 1
This fixes it:
--- ft.c.orig 2005-08-21 23:56:29.016193172 +0200
+++ ft.c 2005-08-21 23:49:01.000000000 +0200
@@ -12,6 +12,7 @@
#include <stdlib.h>
#include <ctype.h>
#include <sys/types.h>
+#include <ft2build.h>
#include <freetype/freetype.h>
#include <freetype/ftglyph.h>
#include <freetype/ftsnames.h>
UFPDF and ASP
Salut !
I would like to know whether an ASP or JS version of UFPDF already exists.
I use FPDF for ASP, and I would like to use a Unicode (UTF-8) extension for this version.
Has someone already converted UFPDF for PHP into an ASP version?
Last question: does a free (or cheap) Unicode-capable ActiveX/COM PDF component exist?
Thank you very much for your help !
Congratulations for this work
Jay, Montpellier, France
doubt
Hi, I used the FPDF ASP version to generate PDFs dynamically. Now I need to generate them in Japanese. If you have any ideas, mail me at
ramya.dhamodaran@in.flextronics.com
thanks and regards
Ramya.d
ufpdf creates big files ?
hi,
Files created by UFPDF seem to be much bigger than those created by FPDF...
Even an « empty » document comes out as a 179k file with the arial.php font added!
Is there technically a solution for this problem,
or is it impossible to create Unicode PDF files without including the fonts?
Thank you for your job !! :)
UTF-8 support in HTML2PDF
Has anyone succeeded in embedding the UFPDF UTF-8 functionality into HTML2PDF? I am trying to do this, but if someone has already done it, you will save me a lot of time.
Thanks for UFPDF!