summaryrefslogtreecommitdiffstats
path: root/tools/jar/README.jar-tools
blob: cf0d8011ee87f00ad4e7f9b542b37b3472961225 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
Q: What does jar-query do?

A: It scans a set of directories containing jars, it opens every jar
   it finds and records the classes in that jar. It stores that
   information and permits queries to be run against it.

Q: What kind of information can jar-query give me?

A: * Given a class list which jars provide it

   * Given an import specification list which jars provide the
     classes which match the specification

   * List which classes appear in more than one jar
     (e.g. multiple definitions) and which jars they appear in.

   * For classes with multiple definitions determine which classes have
     the same implementation (e.g. a copy) and which have different
     implementations (typically different versions).

   * Show the symbolic links which point to a given jar (full link
     traversal).

   * List which RPM provides a given jar.

   * List closed set of which jars and RPM's are necessary to resolve
     a set of classes (e.g. what jars/RPMs are necessary to build/run)

Q: Can jar-query give me information about jars not installed on my
   system? 

A: No. If the jar isn't installed or isn't in the set of directories
   jar-query is told to scan no information will be available for
   those classes. This is a little bit of a chicken and egg problem,
   you might not know you need to install a jar that contains a class
   you need. Not much can be done about that though.

Q: What does java-imports do?

A: It locates a set of Java source files in one or more directories
   and extracts the import specifications from each source file. It
   then lists on stdout the set of unique import specifications. This
   is useful as input to jar-query.

Q: What kind of information can java-imports give me?

A: * The unique import specifications across a collection of java
     files.

   * The list of java files each unique import specification appears
     in.

   * The list of java files given a set of directories and exclude
     filters (e.g. the files it will scan)

Q: How do I control which java files java-imports scans?

A: There are 3 basic controls. The set of paths provided on the
   command line. A path may be either a plain java file or a
   directory. The -r (--recursive) argument controls whether directories
   are recursively scanned or not. One or more exclude regular
   expressions can be provided via the -x (--exclude) argument. The
   regular expression is tested against each path candidate, if any of
   the exclude regular expressions match the path is discarded.

Q: Which directories does jar-query scan and can I modify that?

A: By default jar-query scans the system default jar directory and the
   system default jni directory. Running jar-query with the help
   argument (-h) will print these out. You can add any number of
   additional directories to scan with the -d (--dir) argument. If you
   don't want to include the default directories you can specify the
   -D (--clear-dirs) argument which will zero out the existing
   directory list, then add your directories with one or more -d
   arguments.

Q: I want jar-query to ignore some jars, can I do that?

A: Yes. Use the -x (--exclude) argument. It is a regular expression
   pattern applied to a jar path name, if any of the exclude regular
   expressions match the jar will be ignored. Multiple exclude
   patterns may be specified.

Q: How does jar-query handle symbolic links?

A: It's common for a directory to have symbolic links which point jar
   files. Typically this occurs when an unversioned name
   (e.g. foo.jar) points to a specific jar version
   (e.g. foo-1.2.jar). Sometimes links are established for backward
   compatibility when jar names change.

   jar-query is designed to tell you the ACTUAL jar file a class is
   located in. Which one of the (many) links which point to it are
   usually not of interest and would complicate the
   reporting. Therefore jar-query never gives link names, it always
   does a full link traversal and reports only the ACTUAL jar
   file. However, sometimes it's useful to know how an ACTUAL jar file
   is pointed to by various links. You can use the -L (--links)
   argument which will dump out the link traversal information for
   every ACTUAL jar file located.

Q: How are class names matched in jar-query?

A: By default the match is done as if the class is a import
   specification with support for wildcards
   (e.g. org.company.util.*). If the -R (--regexp) argument is
   provided matches are done using a general purpose regular
   expression. In the special case of interactive use class names will
   auto-complete (via TAB) up to the next dot.

Q: jar-query must build a database of class information each time it's
   run, that's a somewhat expensive operation and the data seldom
   changes. Can I put jar-query in a mode where after it builds it's
   database it sits waiting for me to enter individual queries?

A: Yes. Use the -I (--interactive) argument. After the database is
   built it will prompt you on the command line for a class to
   query. You may use TAB to auto-complete the class name. Each TAB
   will complete up to the next dot (.) in the class path.

Tutorial examples of how to use these tools:
--------------------------------------------

Let's say we have Java application and need to know which jars must be
present to satisfy the class loading. Here is how you might tackle
that problem. We'll use the example of pki-core. First we need to
determine the imports used in the source code, java-imports can do
this for us. We might do something like this:

$ java-imports                       \
  -x /test/                          \
  -r                                 \
  ~/src/dogtag/pki/base/setup        \
  ~/src/dogtag/pki/base/symkey       \
  ~/src/dogtag/pki/base/native-tools \
  ~/src/dogtag/pki/base/util         \
  ~/src/dogtag/pki/base/java-tools   \
  ~/src/dogtag/pki/base/common       \
  ~/src/dogtag/pki/base/selinux      \
  ~/src/dogtag/pki/base/ca           \
  ~/src/dogtag/pki/base/silent       \
  > ~/pkicore-imports

This instructs java-imports to recursively scan (-r) the set of source
code directories comprising pki-core, but exclude any java file in a
test directory. The result is written to ~/pkicore-imports and we'll
show you a partial snippet below:

com.netscape.certsrv.*
com.netscape.certsrv.acls.*
com.netscape.certsrv.apps.*
com.netscape.certsrv.apps.CMS
com.netscape.certsrv.authentication.*
com.netscape.certsrv.authentication.AuthCredentials
com.netscape.certsrv.authentication.AuthToken

Now we want to know which jars and RPM's provide those classes,
jar-query will help do this. Let's develop a strategy. As a first cut
we could do this:

$ jar-query -d /usr/share/java/pki `cat ~/pkicore-imports`

This adds the pki specific jar directory to the jar search path and
performs a query for every import statement we located earlier. Looking
at the output we see some immediate problems, there are more than one
jar providing some of the classes, which one do we want?

If we add the -m argument that will list only classes which have
multiple definitions and which jars they occur in.

$ jar-query -m -d /usr/share/java/pki `cat ~/pkicore-imports`

Some examples might be:

org.w3c.dom.Document
    /usr/share/java/libgcj-4.4.4.jar
    /usr/share/java/xml-commons-apis-1.4.01.jar

com.ibm.wsdl.util.StringUtils
    /usr/share/java/qname-1.5.2.jar
    /usr/share/java/wsdl4j-1.5.2.jar

junit.framework.TestCase
    /usr/share/java/junit-3.8.2.jar
    /usr/share/java/junit4-4.6.jar

O.K. so we run jar-query again and ask it to compare the class
implementations for the duplicates using the -M argument

$ jar-query -M -d /usr/share/java/pki `cat ~/pkicore-imports`

For the above examples this is what it reports:

comparing org.w3c.dom.Document
        equal /usr/share/java/libgcj-4.4.4.jar /usr/share/java/xml-commons-apis-1.4.01.jar

comparing com.ibm.wsdl.util.StringUtils
        equal /usr/share/java/qname-1.5.2.jar /usr/share/java/wsdl4j-1.5.2.jar

comparing junit.framework.TestCase
    not equal /usr/share/java/junit-3.8.2.jar /usr/share/java/junit4-4.6.jar

One thing to notice is that libgcj appears frequently in the duplicate
list and is somewhat of a kitchen sink providing copies of many
classes. We never explicitly include libgcj directly anyway. So let's
exclude libgcj from consideration by providing this argument to
jar-query: -x libgcj the next time.

qname-1.5.2.jar and wsdl4j-1.5.2.jar both provide copies of the same
classes, they both have the same version number, thus we can conclude
they are synonyms for one another and we should pick which we'll use.

Ah, but junit-3.8.2.jar and junit4-4.6.jar are not providing the same
implementation of the class and we notice they have different versions
embedded in their jar names. Thus we can conclude multiple versions of
a jar have been installed and we must be careful to pick the jar whose
version matches our needs.

O.K. So armed with this knowledge lets try it again, we'll exclude the
libgcj jar (-x libgcj) and ask for RPM information (-r) and summary
information (-s):

$ jar-query -s -r -d /usr/share/java/pki -x libgcj `cat ~/pkicore-imports`

The summary is listed below:

Summary:
21 Unique Jar's
     /usr/lib/jss/jss4-4.2.6.jar
     /usr/share/java/commons-codec.jar
     /usr/lib/symkey/symkey-9.0.0.jar
     /usr/share/java/jakarta-taglibs-core-1.1.1.jar
     /usr/share/java/ldapbeans-4.18.jar
     /usr/share/java/ldapfilt-4.18.jar
     /usr/share/java/ldapjdk-4.18.jar
     /usr/share/java/pki-console-2.0.0.jar
     /usr/share/java/pki/certsrv-9.0.0.jar
     /usr/share/java/pki/cms-9.0.0.jar
     /usr/share/java/pki/cmscore-9.0.0.jar
     /usr/share/java/pki/cmsutil-9.0.0.jar
     /usr/share/java/pki/nsutil-9.0.0.jar
     /usr/share/java/tomcat5-jsp-2.0-api-5.5.27.jar
     /usr/share/java/tomcat5-servlet-2.4-api-5.5.27.jar
     /usr/share/java/tomcat6-jsp-2.1-api-6.0.26.jar
     /usr/share/java/tomcat6-servlet-2.5-api-6.0.26.jar
     /usr/share/java/velocity-1.6.3.jar
     /usr/share/java/xerces-j2-2.9.0.jar
     /usr/share/java/xml-commons-apis-1.4.01.jar
     /usr/share/java/xml-commons-apis-ext-1.4.01.jar
15 Unique RPM's
     jakarta-taglibs-standard
     jss
     ldapjdk
     apache-commons-codec
     pki-server
     pki-console
     pki-symkey
     pki-base
     tomcat5-jsp-2.0-api
     tomcat5-servlet-2.4-api
     tomcat6-jsp-2.1-api
     tomcat6-servlet-2.5-api
     velocity
     xerces-j2
     xml-commons-apis

What this is telling us is that there are 21 jars which provide all
the classes needed to satisfy the import statements. However there may
be some duplicates in the list. Also because of wildcard import
statements some classes and hence jars may have been included which
are not actually utilized in the code.

Those 21 jars are provided by the 15 RPM's listed. Once again there
may be some class duplication and/or unnecessary RPM's due to
wildcarding. But this gives a very small manageable list to manually
pick through and make our choices. For example, one thing we can
immediately see is that both tomcat5-servlet-2.4-api and
tomcat6-servlet-2.5-api appear in the list, we clearly need only one
version and pick the one for the version of tomcat we're targeting,
hence tomcat6-servlet-2.5-api, same applies to tomcat6-jsp-2.1-api.