Since 2004, decode_entities() has supported merging surrogate pairs; see http://rt.cpan.org/Ticket/Display.html?id=7785 . This means that, for example, two consecutive numeric character references for a high and a low surrogate (such as &#xD83D;&#xDE00;) are decoded into a single code point. My understanding is that this is not covered by any spec.
I therefore propose to add a function decode_entities_strict() that does the same as decode_entities() but rejects surrogate pairs.
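To make the proposal concrete, here is a rough sketch of what such a function might look like as a wrapper. It is not existing HTML::Entities code: surrogate_references() is a hypothetical helper, and the choice to die on a surrogate reference (rather than, say, leaving it undecoded) is only one possible reading of "rejects". The sketch also treats the trailing semicolon as optional, which is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of HTML::Entities: returns every numeric
# character reference in $text whose value lies in the UTF-16 surrogate
# range U+D800..U+DFFF.
sub surrogate_references {
    my ($text) = @_;
    my @found;
    # Match hexadecimal (&#xD83D;) and decimal (&#55357;) references.
    # The digit limits keep hex() from overflowing on absurdly long input.
    while ($text =~ /(&#(?:[xX]([0-9A-Fa-f]{1,8})|([0-9]{1,10}));?)/g) {
        my $code = defined $2 ? hex($2) : $3;
        push @found, $1 if $code >= 0xD800 && $code <= 0xDFFF;
    }
    return @found;
}

# Sketch of the proposed function (assumed behavior): refuse input that
# contains surrogate references, otherwise defer to decode_entities().
sub decode_entities_strict {
    my ($text) = @_;
    my @bad = surrogate_references($text);
    die "surrogate reference(s) not allowed: @bad\n" if @bad;
    require HTML::Entities;    # loaded lazily; only needed for decoding
    return HTML::Entities::decode_entities($text);
}
```

With this sketch, decode_entities_strict("&#xD83D;&#xDE00;") would die, while input without surrogate references would be passed through to decode_entities() unchanged.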
Attached is a sample script that shows the effect:
surrogate_pair.pl.txt