
Troubleshooting: Site content crawler errors
The site content crawler might encounter errors in WebSphere Commerce.
Problem
Errors about a missing Bouncy Castle JAR file occur when you run the site content crawler. These errors might occur when the crawler decrypts site content such as PDF files.
An error similar to the following occurs:
00000e44 DataImporter E org.apache.solr.common.SolrException log Full Import failed:
org.apache.solr.handler.dataimport.DataImportHandlerException:
java.lang.NoClassDefFoundError: org.bouncycastle.jce.provider.BouncyCastleProvider
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:669)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:622)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:268)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:187)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
Caused by: java.lang.NoClassDefFoundError: org.bouncycastle.jce.provider.BouncyCastleProvider
at java.lang.J9VMInternals.verifyImpl(Native Method)
at java.lang.J9VMInternals.verify(J9VMInternals.java:72)
at java.lang.J9VMInternals.initialize(J9VMInternals.java:134)
at org.apache.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:1324)
at org.apache.pdfbox.pdmodel.PDDocument.decrypt(PDDocument.java:796)
at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:89)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
at com.ibm.commerce.solr.handler.TikaEntityProcessor.load(TikaEntityProcessor.java:276)
at com.ibm.commerce.solr.handler.TikaEntityProcessor.initConnection(TikaEntityProcessor.java:182)
at com.ibm.commerce.solr.handler.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:238)
at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:238)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:596)
... 6 more
Caused by: java.lang.ClassNotFoundException: org.bouncycastle.jce.provider.BouncyCastleProvider
at java.net.URLClassLoader.findClass(URLClassLoader.java:423)
at com.ibm.ws.bootstrap.ExtClassLoader.findClass(ExtClassLoader.java:191)
at java.lang.ClassLoader.loadClass(ClassLoader.java:660)
at com.ibm.ws.bootstrap.ExtClassLoader.loadClass(ExtClassLoader.java:111)
at java.lang.ClassLoader.loadClass(ClassLoader.java:626)
at com.ibm.ws.classloader.ProtectionClassLoader.loadClass(ProtectionClassLoader.java:62)
at com.ibm.ws.classloader.ProtectionClassLoader.loadClass(ProtectionClassLoader.java:58)
at com.ibm.ws.classloader.CompoundClassLoader.loadClass(CompoundClassLoader.java:511)
at java.lang.ClassLoader.loadClass(ClassLoader.java:626)
at com.ibm.ws.classloader.CompoundClassLoader.loadClass(CompoundClassLoader.java:511)
at java.lang.ClassLoader.loadClass(ClassLoader.java:626)
... 20 more
Solution
Ensure that the crawler has all of the JAR files that it requires to crawl the site content.
For example, download missing JAR files such as bcprov-jdk15.jar and bcmail-jdk15.jar from Bouncy Castle, and make them available on the class path that the crawler uses.
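To confirm whether the Bouncy Castle provider class is visible after you add the JAR files, a small check such as the following sketch might help. It is not part of WebSphere Commerce. The class name BouncyCastleCheck and the method isClassAvailable are hypothetical names chosen for illustration. The result reflects only the class path of the JVM that runs the check, so the sketch assumes that you run it with the same class path as the crawler.

```java
// Hypothetical helper, not shipped with WebSphere Commerce or Solr.
final class BouncyCastleCheck {

    // Class name taken from the NoClassDefFoundError in the stack trace.
    static final String PROVIDER_CLASS =
            "org.bouncycastle.jce.provider.BouncyCastleProvider";

    /** Returns true if the named class can be located on the current class path. */
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, BouncyCastleCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class, or one of its dependencies, is not on the class path.
            return false;
        }
    }

    public static void main(String[] args) {
        if (isClassAvailable(PROVIDER_CLASS)) {
            System.out.println("Found " + PROVIDER_CLASS);
        } else {
            System.out.println("Missing " + PROVIDER_CLASS
                    + "; check that the Bouncy Castle JAR files are on the class path.");
        }
    }
}
```

If the check reports the class as missing even though the JAR files are in place, the class path used for the check probably differs from the one that the crawler uses.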