Commit | Line | Data |
---|---|---|
920dae64 AT |
1 | <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> |
2 | <html> | |
3 | <head> | |
4 | <title>Scripting Languages</title> | |
5 | <link rel="stylesheet" type="text/css" href="style.css"> | |
6 | </head> | |
7 | ||
8 | <body bgcolor="#ffffff"> | |
9 | <H1><a name="Scripting"></a>4 Scripting Languages</H1> | |
10 | <!-- INDEX --> | |
11 | <div class="sectiontoc"> | |
12 | <ul> | |
13 | <li><a href="#Scripting_nn2">The two language view of the world</a> | |
14 | <li><a href="#Scripting_nn3">How does a scripting language talk to C?</a> | |
15 | <ul> | |
16 | <li><a href="#Scripting_nn4">Wrapper functions</a> | |
17 | <li><a href="#Scripting_nn5">Variable linking</a> | |
18 | <li><a href="#Scripting_nn6">Constants</a> | |
19 | <li><a href="#Scripting_nn7">Structures and classes</a> | |
20 | <li><a href="#Scripting_nn8">Proxy classes</a> | |
21 | </ul> | |
22 | <li><a href="#Scripting_nn9">Building scripting language extensions</a> | |
23 | <ul> | |
24 | <li><a href="#Scripting_nn10">Shared libraries and dynamic loading</a> | |
25 | <li><a href="#Scripting_nn11">Linking with shared libraries</a> | |
26 | <li><a href="#Scripting_nn12">Static linking</a> | |
27 | </ul> | |
28 | </ul> | |
29 | </div> | |
30 | <!-- INDEX --> | |
31 | ||
32 | ||
33 | ||
34 | <p> | |
35 | This chapter provides a brief overview of scripting language extension | |
36 | programming and the mechanisms by which scripting language interpreters | |
37 | access C and C++ code. | |
38 | </p> | |
39 | ||
40 | <H2><a name="Scripting_nn2"></a>4.1 The two language view of the world</H2> | |
41 | ||
42 | ||
43 | <p> | |
44 | When a scripting language is used to control a C program, the | |
45 | resulting system tends to look as follows: | |
46 | </p> | |
47 | ||
48 | <center><img src="ch2.1.png" alt="Scripting language input - C/C++ functions output"></center> | |
49 | ||
50 | <p> | |
51 | In this programming model, the scripting language interpreter is used | |
52 | for high level control whereas the underlying functionality of the | |
53 | C/C++ program is accessed through special scripting language | |
54 | "commands." If you have ever tried to write your own simple command | |
55 | interpreter, you might view the scripting language approach | |
56 | as a highly advanced implementation of that idea. Likewise, | |
57 | if you have ever used a package such as MATLAB or IDL, the model is | |
58 | very similar: the interpreter executes user commands and | |
59 | scripts, but most of the underlying functionality is written in | |
60 | a low-level language like C or Fortran. | |
61 | </p> | |
62 | ||
63 | <p> | |
64 | The two-language model of computing is extremely powerful because it | |
65 | exploits the strengths of each language. C/C++ can be used for maximal | |
66 | performance and complicated systems programming tasks. Scripting | |
67 | languages can be used for rapid prototyping, interactive debugging, | |
68 | scripting, and access to high-level data structures such as associative | |
69 | arrays. </p> | |
70 | ||
71 | <H2><a name="Scripting_nn3"></a>4.2 How does a scripting language talk to C?</H2> | |
72 | ||
73 | ||
74 | <p> | |
75 | Scripting languages are built around a parser that knows how | |
76 | to execute commands and scripts. Within this parser, there is a | |
77 | mechanism for executing commands and accessing variables. | |
78 | Normally, this is used to implement the built-in features | |
79 | of the language. However, by extending the interpreter, it is usually | |
80 | possible to add new commands and variables. To do this, | |
81 | most languages define a special API for adding new commands. | |
82 | Furthermore, a special foreign function interface defines how these | |
83 | new commands are supposed to hook into the interpreter. | |
84 | </p> | |
85 | ||
86 | <p> | |
87 | Typically, when you add a new command to a scripting interpreter, | |
88 | you need to do two things. First, you need to write a special | |
89 | "wrapper" function that serves as the glue between the interpreter | |
90 | and the underlying C function. Then you need to give the interpreter | |
91 | information about the wrapper by providing details about the name of the | |
92 | function, arguments, and so forth. The next few sections illustrate | |
93 | the process. | |
94 | </p> | |
95 | ||
96 | <H3><a name="Scripting_nn4"></a>4.2.1 Wrapper functions</H3> | |
97 | ||
98 | ||
99 | <p> | |
100 | Suppose you have an ordinary C function like this:</p> | |
101 | ||
102 | <div class="code"><pre> | |
103 | int fact(int n) { | |
104 | if (n <= 1) return 1; | |
105 | else return n*fact(n-1); | |
106 | } | |
107 | </pre></div> | |
108 | ||
109 | <p> | |
110 | In order to access this function from a scripting language, it is | |
111 | necessary to write a special "wrapper" function that serves as the | |
112 | glue between the scripting language and the underlying C function. A | |
113 | wrapper function must do three things:</p> | |
114 | ||
115 | <ul> | |
116 | <li>Gather function arguments and make sure they are valid. | |
117 | <li>Call the C function. | |
118 | <li>Convert the return value into a form recognized by the scripting language. | |
119 | </ul> | |
120 | ||
121 | <p> | |
122 | As an example, the Tcl wrapper function for the <tt>fact()</tt> | |
123 | function above might look like the following:</p> | |
124 | ||
125 | <div class="code"><pre> | |
126 | int wrap_fact(ClientData clientData, Tcl_Interp *interp, | |
127 | int argc, const char *argv[]) { | |
128 | int result; | |
129 | int arg0; | |
130 | if (argc != 2) { | |
131 | Tcl_SetObjResult(interp, Tcl_NewStringObj("wrong # args", -1)); | |
132 | return TCL_ERROR; | |
133 | } | |
134 | if (Tcl_GetInt(interp, argv[1], &arg0) != TCL_OK) return TCL_ERROR; | |
135 | result = fact(arg0); | |
136 | Tcl_SetObjResult(interp, Tcl_NewIntObj(result)); | |
137 | return TCL_OK; | |
138 | } | |
139 | ||
140 | </pre></div> | |
141 | ||
142 | <p> | |
143 | Once you have created a wrapper function, the final step is to tell the | |
144 | scripting language about the new function. This is usually done in an | |
145 | initialization function called by the language when the module is | |
146 | loaded. For example, adding the above function to the Tcl interpreter | |
147 | requires code like the following:</p> | |
148 | ||
149 | <div class="code"><pre> | |
150 | int Wrap_Init(Tcl_Interp *interp) { | |
151 | Tcl_CreateCommand(interp, "fact", wrap_fact, (ClientData) NULL, | |
152 | (Tcl_CmdDeleteProc *) NULL); | |
153 | return TCL_OK; | |
154 | } | |
155 | </pre></div> | |
156 | ||
157 | <p> | |
158 | Once this initialization function has run, Tcl has a new command called "<tt>fact</tt>" | |
159 | that you can use like any other Tcl command.</p> | |
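<p>
The wrapper and registration steps can also be sketched without any Tcl
dependency. The following is a minimal toy command table, not the Tcl API: the
names <tt>toy_register</tt>, <tt>toy_eval</tt>, <tt>toy_cmd_proc</tt> and the
<tt>TOY_*</tt> constants are hypothetical and exist only for illustration. It
shows a wrapper that validates its argument, calls <tt>fact()</tt>, converts
the result to a string, and is then looked up by name.
</p>

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy interpreter pieces: NOT the Tcl API, only the same idea. */
#define TOY_OK 0
#define TOY_ERROR 1
#define TOY_MAX_COMMANDS 16

typedef int (*toy_cmd_proc)(int argc, const char *argv[],
                            char *result, size_t result_size);

struct toy_command {
  const char *name;
  toy_cmd_proc proc;
};

static struct toy_command toy_commands[TOY_MAX_COMMANDS];
static int toy_command_count = 0;

/* The ordinary C function being wrapped (overflows int for large n). */
int fact(int n) {
  if (n <= 1) return 1;
  return n * fact(n - 1);
}

/* Wrapper: gather and validate arguments, call fact(), convert the result. */
int wrap_fact(int argc, const char *argv[], char *result, size_t result_size) {
  char *end;
  long value;
  if (argc != 2) {
    snprintf(result, result_size, "wrong # args");
    return TOY_ERROR;
  }
  errno = 0;
  value = strtol(argv[1], &end, 10);
  if (end == argv[1] || *end != '\0' || errno == ERANGE ||
      value < INT_MIN || value > INT_MAX) {
    snprintf(result, result_size, "expected integer but got \"%s\"", argv[1]);
    return TOY_ERROR;
  }
  snprintf(result, result_size, "%d", fact((int) value));
  return TOY_OK;
}

/* Registration: record the command name together with its wrapper. */
int toy_register(const char *name, toy_cmd_proc proc) {
  assert(name != NULL && proc != NULL);
  if (toy_command_count >= TOY_MAX_COMMANDS) return TOY_ERROR;
  toy_commands[toy_command_count].name = name;
  toy_commands[toy_command_count].proc = proc;
  toy_command_count++;
  return TOY_OK;
}

/* Dispatch: look up argv[0] and hand the arguments to its wrapper. */
int toy_eval(int argc, const char *argv[], char *result, size_t result_size) {
  int i;
  if (argc < 1) {
    snprintf(result, result_size, "empty command");
    return TOY_ERROR;
  }
  for (i = 0; i < toy_command_count; i++) {
    if (strcmp(toy_commands[i].name, argv[0]) == 0) {
      return toy_commands[i].proc(argc, argv, result, result_size);
    }
  }
  snprintf(result, result_size, "invalid command name \"%s\"", argv[0]);
  return TOY_ERROR;
}
```

<p>
In this sketch, calling <tt>toy_register("fact", wrap_fact)</tt> plays the role
of the initialization function, and <tt>toy_eval()</tt> stands in for the
interpreter executing a command line that has already been split into words.
</p>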
160 | ||
161 | <p> | |
162 | Although the process of adding a new function to Tcl has been | |
163 | illustrated, the procedure is almost identical for Perl and | |
164 | Python. Both require special wrappers to be written and both need | |
165 | additional initialization code. Only the specific details are | |
166 | different.</p> | |
167 | ||
168 | <H3><a name="Scripting_nn5"></a>4.2.2 Variable linking</H3> | |
169 | ||
170 | ||
171 | <p> | |
172 | Variable linking refers to the problem of mapping a | |
173 | C/C++ global variable to a variable in the scripting | |
174 | language interpreter. For example, suppose you had the following | |
175 | variable:</p> | |
176 | ||
177 | <div class="code"><pre> | |
178 | double Foo = 3.5; | |
179 | </pre></div> | |
180 | ||
181 | <p> | |
182 | It might be nice to access it from a script as follows (shown for Perl):</p> | |
183 | ||
184 | <div class="targetlang"><pre> | |
185 | $a = $Foo * 2.3; # Evaluation | |
186 | $Foo = $a + 2.0; # Assignment | |
187 | </pre></div> | |
188 | ||
189 | <p> | |
190 | To provide such access, variables are commonly manipulated using a | |
191 | pair of get/set functions. For example, whenever the value of a | |
192 | variable is read, a "get" function is invoked. Similarly, whenever | |
193 | the value of a variable is changed, a "set" function is called. | |
194 | </p> | |
195 | ||
196 | <p> | |
197 | In many languages, calls to the get/set functions can be attached to | |
198 | evaluation and assignment operators. Therefore, evaluating a variable | |
199 | such as <tt>$Foo</tt> might implicitly call the get function. Similarly, | |
200 | typing <tt>$Foo = 4</tt> would call the underlying set function to change | |
201 | the value. | |
202 | </p> | |
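<p>
As a rough sketch of such a pair, the functions below convert between the C
variable <tt>Foo</tt> and the string form an interpreter might hand over. The
names <tt>Foo_get</tt> and <tt>Foo_set</tt> and their signatures are
assumptions made for this example; a real interpreter defines its own calling
convention for linked variables.
</p>

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* The C global variable to be linked. */
double Foo = 3.5;

/* "get": convert the current C value into text for the interpreter.
   Returns 0 on success, -1 if the buffer is too small. */
int Foo_get(char *out, size_t out_size) {
  int n;
  assert(out != NULL);
  n = snprintf(out, out_size, "%g", Foo);
  return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}

/* "set": parse text from the interpreter and store it in the C variable.
   Returns 0 on success, -1 if the text is not a valid number. */
int Foo_set(const char *text) {
  char *end;
  double value;
  assert(text != NULL);
  value = strtod(text, &end);
  if (end == text || *end != '\0') return -1;
  Foo = value;
  return 0;
}
```

<p>
An interpreter that supports variable traces would arrange for
<tt>Foo_get</tt> to run on each read of <tt>$Foo</tt> and <tt>Foo_set</tt> on
each assignment. Invalid input leaves <tt>Foo</tt> unchanged.
</p>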
203 | ||
204 | <H3><a name="Scripting_nn6"></a>4.2.3 Constants</H3> | |
205 | ||
206 | ||
207 | <p> | |
208 | In many cases, a C program or library may define a large collection of | |
209 | constants. For example: | |
210 | </p> | |
211 | ||
212 | <div class="code"><pre> | |
213 | #define RED 0xff0000 | |
214 | #define BLUE 0x0000ff | |
215 | #define GREEN 0x00ff00 | |
216 | </pre></div> | |
217 | <p> | |
218 | To make constants available, their values can be stored in scripting | |
219 | language variables such as <tt>$RED</tt>, <tt>$BLUE</tt>, and | |
220 | <tt>$GREEN</tt>. Virtually all scripting languages provide C | |
221 | functions for creating variables so installing constants is usually | |
222 | a trivial exercise. | |
223 | </p> | |
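<p>
A minimal sketch of installing these constants is shown below. It uses a toy
variable table in place of a real interpreter; <tt>toy_set_var</tt>,
<tt>toy_get_var</tt> and <tt>install_constants</tt> are hypothetical names,
and a real language would supply its own variable-creation function.
</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define RED   0xff0000
#define BLUE  0x0000ff
#define GREEN 0x00ff00

#define TOY_MAX_VARS 16

/* Toy stand-in for an interpreter's variable table. */
struct toy_var {
  const char *name;
  char value[32];
};

static struct toy_var toy_vars[TOY_MAX_VARS];
static int toy_var_count = 0;

/* Look up a variable by name; returns NULL if it does not exist. */
const char *toy_get_var(const char *name) {
  int i;
  for (i = 0; i < toy_var_count; i++) {
    if (strcmp(toy_vars[i].name, name) == 0) return toy_vars[i].value;
  }
  return NULL;
}

/* Create a new variable (no check for duplicates in this sketch). */
int toy_set_var(const char *name, const char *value) {
  assert(name != NULL && value != NULL);
  if (toy_var_count >= TOY_MAX_VARS) return -1;
  if (strlen(value) >= sizeof toy_vars[0].value) return -1;
  toy_vars[toy_var_count].name = name;
  strcpy(toy_vars[toy_var_count].value, value);
  toy_var_count++;
  return 0;
}

/* Install each C constant as a variable holding its decimal value. */
int install_constants(void) {
  static const struct { const char *name; long value; } constants[] = {
    {"RED", RED}, {"BLUE", BLUE}, {"GREEN", GREEN}
  };
  char text[32];
  size_t i;
  for (i = 0; i < sizeof constants / sizeof constants[0]; i++) {
    snprintf(text, sizeof text, "%ld", constants[i].value);
    if (toy_set_var(constants[i].name, text) != 0) return -1;
  }
  return 0;
}
```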
224 | ||
225 | <H3><a name="Scripting_nn7"></a>4.2.4 Structures and classes</H3> | |
226 | ||
227 | ||
228 | <p> | |
229 | Although scripting languages have no trouble accessing simple | |
230 | functions and variables, accessing C/C++ structures and classes | |
231 | presents a different problem. This is because using a structure | |
232 | directly requires knowledge of its | |
233 | data representation and layout. Furthermore, certain language features | |
234 | are difficult to map to an interpreter. For instance, what | |
235 | does C++ inheritance mean in a Perl interface? | |
236 | </p> | |
237 | ||
238 | <p> | |
239 | The most straightforward technique for handling structures is to | |
240 | implement a collection of accessor functions that hide the underlying | |
241 | representation of a structure. For example, | |
242 | </p> | |
243 | ||
244 | <div class="code"><pre> | |
245 | struct Vector { | |
246 | Vector(); | |
247 | ~Vector(); | |
248 | double x,y,z; | |
249 | }; | |
250 | ||
251 | </pre></div> | |
252 | ||
253 | <p> | |
254 | can be transformed into the following set of functions: | |
255 | </p> | |
256 | ||
257 | <div class="code"><pre> | |
258 | Vector *new_Vector(); | |
259 | void delete_Vector(Vector *v); | |
260 | double Vector_x_get(Vector *v); | |
261 | double Vector_y_get(Vector *v); | |
262 | double Vector_z_get(Vector *v); | |
263 | void Vector_x_set(Vector *v, double x); | |
264 | void Vector_y_set(Vector *v, double y); | |
265 | void Vector_z_set(Vector *v, double z); | |
266 | ||
267 | </pre></div> | |
268 | <p> | |
269 | Now, from an interpreter these functions might be used as follows: | |
270 | </p> | |
271 | ||
272 | <div class="targetlang"><pre> | |
273 | % set v [new_Vector] | |
274 | % Vector_x_set $v 3.5 | |
275 | % Vector_y_get $v | |
276 | % delete_Vector $v | |
277 | % ... | |
278 | </pre></div> | |
279 | ||
280 | <p> | |
281 | Since accessor functions provide a mechanism for accessing the | |
282 | internals of an object, the interpreter does not need to know anything | |
283 | about the actual representation of a <tt>Vector</tt>. | |
284 | </p> | |
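<p>
For illustration, the accessor functions might be implemented roughly as
follows. This sketch uses a plain C <tt>struct</tt> as a stand-in for the C++
<tt>Vector</tt> above, so the constructor and destructor are replaced by
<tt>malloc()</tt> and <tt>free()</tt>; that substitution is an assumption of
the example.
</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Plain C stand-in for the Vector structure. */
typedef struct Vector {
  double x, y, z;
} Vector;

/* Create a Vector with all components set to zero (NULL on failure). */
Vector *new_Vector(void) {
  Vector *v = malloc(sizeof *v);
  if (v != NULL) {
    v->x = 0.0;
    v->y = 0.0;
    v->z = 0.0;
  }
  return v;
}

/* Destroy a Vector created by new_Vector(). */
void delete_Vector(Vector *v) { free(v); }

/* Accessors: the only code that touches the structure layout. */
double Vector_x_get(Vector *v) { assert(v != NULL); return v->x; }
double Vector_y_get(Vector *v) { assert(v != NULL); return v->y; }
double Vector_z_get(Vector *v) { assert(v != NULL); return v->z; }
void Vector_x_set(Vector *v, double x) { assert(v != NULL); v->x = x; }
void Vector_y_set(Vector *v, double y) { assert(v != NULL); v->y = y; }
void Vector_z_set(Vector *v, double z) { assert(v != NULL); v->z = z; }
```

<p>
Each of these functions takes and returns only pointers and numbers, so each
can be wrapped in the same way as an ordinary function.
</p>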
285 | ||
286 | <H3><a name="Scripting_nn8"></a>4.2.5 Proxy classes</H3> | |
287 | ||
288 | ||
289 | <p> | |
290 | In certain cases, it is possible to use the low-level accessor functions | |
291 | to create a proxy class, also known as a shadow class. | |
292 | A proxy class is a special kind of object that gets created | |
293 | in a scripting language to access a C/C++ class (or struct) in a way | |
294 | that looks like the original structure (that is, it proxies the real | |
295 | C++ class). For example, if you | |
296 | have the following C++ definition:</p> | |
297 | ||
298 | <div class="code"><pre> | |
299 | class Vector { | |
300 | public: | |
301 | Vector(); | |
302 | ~Vector(); | |
303 | double x,y,z; | |
304 | }; | |
305 | </pre></div> | |
306 | ||
307 | <p> | |
308 | A proxy class mechanism would allow you to access the structure in | |
309 | a more natural manner from the interpreter. For example, in Python, you might want to do this: | |
310 | </p> | |
311 | ||
312 | <div class="targetlang"><pre> | |
313 | >>> v = Vector() | |
314 | >>> v.x = 3 | |
315 | >>> v.y = 4 | |
316 | >>> v.z = -13 | |
317 | >>> ... | |
318 | >>> del v | |
319 | </pre></div> | |
320 | ||
321 | <p> | |
322 | Similarly, in Perl5 you may want the interface to work like this:</p> | |
323 | ||
324 | <div class="targetlang"><pre> | |
325 | $v = new Vector; | |
326 | $v->{x} = 3; | |
327 | $v->{y} = 4; | |
328 | $v->{z} = -13; | |
329 | ||
330 | </pre></div> | |
331 | <p> | |
332 | Finally, in Tcl: | |
333 | </p> | |
334 | ||
335 | <div class="targetlang"><pre> | |
336 | Vector v | |
337 | v configure -x 3 -y 4 -z -13 | |
338 | ||
339 | </pre></div> | |
340 | ||
341 | <p> | |
342 | When proxy classes are used, two objects are really at work--one in | |
343 | the scripting language, and an underlying C/C++ object. Operations | |
344 | affect both objects equally and, for all practical purposes, it appears | |
345 | as if you are simply manipulating a C/C++ object. | |
346 | </p> | |
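<p>
One way to picture the forwarding is a table that maps attribute names to the
low-level accessor functions. The sketch below is written in C only to keep
the example self-contained; in practice the proxy object lives in the
scripting language. The names <tt>VectorProxy</tt>, <tt>proxy_getattr</tt> and
<tt>proxy_setattr</tt> are hypothetical.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Plain C stand-in for the underlying object and its accessors. */
typedef struct Vector { double x, y, z; } Vector;

double Vector_x_get(Vector *v) { return v->x; }
double Vector_y_get(Vector *v) { return v->y; }
double Vector_z_get(Vector *v) { return v->z; }
void Vector_x_set(Vector *v, double x) { v->x = x; }
void Vector_y_set(Vector *v, double y) { v->y = y; }
void Vector_z_set(Vector *v, double z) { v->z = z; }

/* Attribute table: name -> get/set accessor pair. */
struct attribute {
  const char *name;
  double (*get)(Vector *);
  void (*set)(Vector *, double);
};

static const struct attribute vector_attributes[] = {
  {"x", Vector_x_get, Vector_x_set},
  {"y", Vector_y_get, Vector_y_set},
  {"z", Vector_z_get, Vector_z_set}
};

/* Hypothetical proxy: holds only a pointer to the real object. */
typedef struct VectorProxy { Vector *self; } VectorProxy;

static const struct attribute *find_attribute(const char *name) {
  size_t i;
  size_t count = sizeof vector_attributes / sizeof vector_attributes[0];
  for (i = 0; i < count; i++) {
    if (strcmp(vector_attributes[i].name, name) == 0) {
      return &vector_attributes[i];
    }
  }
  return NULL;
}

/* Attribute assignment on the proxy is forwarded to the set accessor. */
int proxy_setattr(VectorProxy *p, const char *name, double value) {
  const struct attribute *a;
  assert(p != NULL && p->self != NULL && name != NULL);
  a = find_attribute(name);
  if (a == NULL) return -1;
  a->set(p->self, value);
  return 0;
}

/* Attribute lookup on the proxy is forwarded to the get accessor. */
int proxy_getattr(VectorProxy *p, const char *name, double *out) {
  const struct attribute *a;
  assert(p != NULL && p->self != NULL && name != NULL && out != NULL);
  a = find_attribute(name);
  if (a == NULL) return -1;
  *out = a->get(p->self);
  return 0;
}
```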
347 | ||
348 | <H2><a name="Scripting_nn9"></a>4.3 Building scripting language extensions</H2> | |
349 | ||
350 | ||
351 | <p> | |
352 | The final step in using a scripting language with your C/C++ | |
353 | application is adding your extensions to the scripting language | |
354 | itself. There are two primary approaches for doing | |
355 | this. The preferred technique is to build a dynamically loadable | |
356 | extension in the form of a shared library. Alternatively, you can | |
357 | recompile the scripting language interpreter with your extensions | |
358 | added to it. | |
359 | </p> | |
360 | ||
361 | <H3><a name="Scripting_nn10"></a>4.3.1 Shared libraries and dynamic loading</H3> | |
362 | ||
363 | ||
364 | <p> | |
365 | To create a shared library or DLL, you often need to look at the | |
366 | manual pages for your compiler and linker. However, the procedure | |
367 | for a few common machines is shown below:</p> | |
368 | ||
369 | <div class="shell"><pre> | |
370 | # Build a shared library for Solaris | |
371 | gcc -c example.c example_wrap.c -I/usr/local/include | |
372 | ld -G example.o example_wrap.o -o example.so | |
373 | ||
374 | # Build a shared library for Linux | |
375 | gcc -fpic -c example.c example_wrap.c -I/usr/local/include | |
376 | gcc -shared example.o example_wrap.o -o example.so | |
377 | ||
378 | # Build a shared library for Irix | |
379 | gcc -c example.c example_wrap.c -I/usr/local/include | |
380 | ld -shared example.o example_wrap.o -o example.so | |
381 | ||
382 | </pre></div> | |
383 | ||
384 | <p> | |
385 | To use your shared library, you simply use the corresponding command | |
386 | in the scripting language (load, import, use, etc.). This will | |
387 | import your module and allow you to start using it. For example: | |
388 | </p> | |
389 | ||
390 | <div class="targetlang"><pre> | |
391 | % load ./example.so | |
392 | % fact 4 | |
393 | 24 | |
394 | % | |
395 | </pre></div> | |
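<p>
On POSIX systems, a <tt>load</tt>-style command is typically built on the
dynamic loader interface in <tt>&lt;dlfcn.h&gt;</tt>. The sketch below shows
the general shape only: the function name <tt>load_extension</tt> and the
<tt>init_func</tt> signature are assumptions made for this example, and each
interpreter defines its own initialization function name and arguments. Some
older systems also require linking with <tt>-ldl</tt>.
</p>

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Assumed shape of a module initialization function (hypothetical). */
typedef int (*init_func)(void *interp);

/* Open a shared library, find its initialization function, and call it.
   Returns -1 if the library or symbol cannot be found; otherwise returns
   whatever the initialization function returns. On success the library
   is deliberately left loaded so its commands stay available. */
int load_extension(const char *path, const char *init_name, void *interp) {
  void *handle;
  init_func init;
  const char *msg;

  assert(path != NULL && init_name != NULL);

  handle = dlopen(path, RTLD_NOW);
  if (handle == NULL) {
    msg = dlerror();
    fprintf(stderr, "load failed: %s\n", msg ? msg : "unknown error");
    return -1;
  }

  /* POSIX idiom for storing the result of dlsym() in a function pointer. */
  *(void **) (&init) = dlsym(handle, init_name);
  if (init == NULL) {
    msg = dlerror();
    fprintf(stderr, "no init function: %s\n", msg ? msg : "unknown error");
    dlclose(handle);
    return -1;
  }

  return init(interp);
}
```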
396 | ||
397 | <p> | |
398 | When working with C++ code, the process of building shared libraries | |
399 | may be more complicated--primarily due to the fact that C++ modules may need | |
400 | additional code in order to operate correctly. On many machines, you | |
401 | can build a shared C++ module by following the above procedures, but | |
402 | changing the link line to the following:</p> | |
403 | ||
404 | <div class="shell"><pre> | |
405 | c++ -shared example.o example_wrap.o -o example.so | |
406 | </pre></div> | |
407 | ||
408 | <H3><a name="Scripting_nn11"></a>4.3.2 Linking with shared libraries</H3> | |
409 | ||
410 | ||
411 | <p> | |
412 | When building extensions as shared libraries, it is not uncommon for | |
413 | your extension to rely upon other shared libraries on your machine. In | |
414 | order for the extension to work, it needs to be able to find all of | |
415 | these libraries at run-time. Otherwise, you may get an error such as | |
416 | the following:</p> | |
417 | ||
418 | <div class="targetlang"><pre> | |
419 | >>> import graph | |
420 | Traceback (innermost last): | |
421 | File "<stdin>", line 1, in ? | |
422 | File "/home/sci/data1/beazley/graph/graph.py", line 2, in ? | |
423 | import graphc | |
424 | ImportError: 1101:/home/sci/data1/beazley/bin/python: rld: Fatal Error: cannot | |
425 | successfully map soname 'libgraph.so' under any of the filenames /usr/lib/libgraph.so:/ | |
426 | lib/libgraph.so:/lib/cmplrs/cc/libgraph.so:/usr/lib/cmplrs/cc/libgraph.so: | |
427 | >>> | |
428 | </pre></div> | |
429 | <p> | |
430 | ||
431 | What this error means is that the extension module created by SWIG | |
432 | depends upon a shared library called "<tt>libgraph.so</tt>" that the | |
433 | system was unable to locate. To fix this problem, there are a few | |
434 | approaches you can take.</p> | |
435 | ||
436 | <ul> | |
437 | <li>Link your extension and explicitly tell the linker where the | |
438 | required libraries are located. Often, this can be done with a | |
439 | special linker flag such as <tt>-R</tt>, <tt>-rpath</tt>, etc. This | |
440 | is not implemented in a standard manner so read the man pages for your | |
441 | linker to find out more about how to set the search path for shared | |
442 | libraries. | |
443 | ||
444 | <li>Put shared libraries in the same directory as the executable. This | |
445 | technique is sometimes required for correct operation on non-Unix | |
446 | platforms. | |
447 | ||
448 | <li>Set the UNIX environment variable <tt>LD_LIBRARY_PATH</tt> to the | |
449 | directory where shared libraries are located before running Python. | |
450 | Although this is an easy solution, it is not recommended. Consider setting | |
451 | the path using linker options instead. | |
452 | ||
453 | </ul> | |
454 | ||
455 | <H3><a name="Scripting_nn12"></a>4.3.3 Static linking</H3> | |
456 | ||
457 | ||
458 | <p> | |
459 | With static linking, you rebuild the scripting language interpreter | |
460 | with extensions. The process usually involves compiling a short main | |
461 | program that adds your customized commands to the language and starts | |
462 | the interpreter. You then link your program with a library to produce | |
463 | a new scripting language executable. | |
464 | </p> | |
465 | ||
466 | <p> | |
467 | Although static linking is supported on all platforms, this is not | |
468 | the preferred technique for building scripting language | |
469 | extensions. In fact, there are very few practical reasons for doing this--consider | |
470 | using shared libraries instead. | |
471 | </p> | |
472 | ||
473 | </body> | |
474 | </html> |