Repairing wrongly encoded XML files
One of our providers occasionally sends XML feeds that are tagged as UTF-8 encoded documents but nonetheless contain byte sequences that are not valid UTF-8. This causes the parser to throw an exception and stop building the DOM as soon as these characters are encountered:
DocumentBuilder.parse(ByteArrayInputStream bais)
throws the following exception:
org.xml.sax.SAXParseException: Invalid byte 2 of 2-byte UTF-8 sequence.
Is there a way to "capture" these problems early and avoid the exception (i.e. by finding and removing those characters from the stream)? I'm looking for a "best effort" kind of fallback for wrongly encoded documents. The correct fix would obviously be to tackle the problem at the source and make sure that only correct documents are delivered, but what is a good approach when that is not possible?
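For illustration, one possible "best effort" fallback is to decode the bytes yourself with a lenient UTF-8 decoder and hand the parser a character stream instead of a byte stream. This is only a sketch under assumptions: the class name LenientXmlParser and the method parseLenient are hypothetical, and it assumes the feed really is mostly UTF-8 with a few stray bytes. It relies on standard JDK classes (CharsetDecoder, CodingErrorAction, InputSource).

```java
import java.io.StringReader;
import java.nio.ByteBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
final class LenientXmlParser {

    // Decodes the bytes as UTF-8, replacing every malformed sequence
    // with U+FFFD instead of failing, then parses the resulting text.
    static Document parseLenient(byte[] xmlBytes) throws Exception {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);

        String text = decoder.decode(ByteBuffer.wrap(xmlBytes)).toString();

        // A decoded byte order mark would otherwise appear before the
        // XML declaration and upset the parser.
        if (text.startsWith("\uFEFF")) {
            text = text.substring(1);
        }

        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // With a character stream the parser does no byte decoding of
        // its own, so the encoding declaration in the prolog is ignored.
        return builder.parse(new InputSource(new StringReader(text)));
    }
}
```

Using CodingErrorAction.IGNORE instead of REPLACE would drop the bad bytes silently rather than marking them. If the stray bytes turn out to come from a single-byte encoding such as ISO-8859-1, decoding the whole feed with that charset may recover the intended characters, but that is a guess about the provider's data, not something this sketch can detect.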