Repairing wrong encoding in XML files


One of our providers occasionally sends XML feeds that are tagged as UTF-8 encoded documents but nonetheless contain bytes that are not valid UTF-8. This causes the parser to throw an exception and stop building the DOM as soon as these characters are encountered:



DocumentBuilder.parse(ByteArrayInputStream bais)


throws the following exception:



org.xml.sax.SAXParseException: Invalid byte 2 of 2-byte UTF-8 sequence.


Is there a way to "capture" these problems early and avoid the exception (i.e. by finding and removing those characters from the stream)? I'm looking for a "best effort" kind of fallback for wrongly encoded documents. The correct fix would obviously be to address the problem at the source and make sure that only correct documents are delivered, but what is a good approach when that is not possible?
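One possible best-effort fallback, sketched here under assumptions rather than offered as a definitive fix, is to decode the bytes yourself with a lenient CharsetDecoder and hand the parser a character stream, so that it never has to decode the raw bytes. The class name LenientXmlParser and its two methods are hypothetical names chosen for illustration. The sketch assumes the feed fits in memory and that substituting U+FFFD for each bad byte sequence is acceptable.

```java
import java.io.StringReader;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
final class LenientXmlParser {

    // Decode as UTF-8, substituting U+FFFD for every malformed byte sequence
    // instead of failing on it.
    static String decodeLeniently(byte[] xmlBytes) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .decode(ByteBuffer.wrap(xmlBytes))
                .toString();
    }

    // Parse from a Reader: the parser receives characters, so it does not
    // decode the bytes itself and cannot hit the invalid sequences.
    static Document parseBestEffort(byte[] xmlBytes) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(decodeLeniently(xmlBytes))));
    }

    public static void main(String[] args) throws Exception {
        // Simulate a feed labelled UTF-8 whose bytes are really ISO-8859-1.
        byte[] broken = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>Caf\u00e9</name>"
                .getBytes(StandardCharsets.ISO_8859_1);
        Document doc = parseBestEffort(broken);
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

Switching CodingErrorAction.REPLACE to CodingErrorAction.IGNORE drops the offending bytes silently instead of marking them. Either way the original characters are lost, so it may be worth logging which feeds needed the fallback.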


