2009-10-29

Visual C++ ... MD and MT compile option

This article is only for limited audiences. If you use Visual Studio C++ (2008) and have a question about What is the MD and MT, then you might be interested in this.

When I made a Console Application on the Visual C++ 2008 using STL. I start to get the following error.

error LNK2001: unresolved external symbol __imp___CrtDbgReportW

It is related with a Windows runtime library. One thing is Unicode related, so I switched off.

It seems this is related C/C++ Code generation, Multi-threaded Debug Dll (MT option) (/Mdd) or Multi-threaded debug (/MTd). (Note: only one character difference /MDd and /MTd.) In some reason while I develop an application, I got a warning about there is a library conflict, and suggested to ignore one of the runtime libraries. I followed the link's output and ignore the 'MSVCRT' library. But this is the problem. I removed it from the ignore library list.

I set up all the Code Generation to Multi-threaded Debug DLL (/MDd) for debug, and not ignore any runtime library. Then now it works. This was a confusion.

Then, I asked my colleague, T. What is this MD and MT stuffs. He said:

- MT: Multi-threaded static library. This is good for distribution a whole application package, since everything is statically linked and there is no dependency to the environment.

- MD: Multi-threaded Dynamic link library. If you distribute some library instead of a whole application, dynamically linked runtime is better since, some part of the program uses one version of runtime, and the other uses different version,this causes a trouble. For example, 'new'ed some memory at one version of runtime, and 'delete'd by other version of runtime could be a trouble.

So, it depends on the situation and I see now it is better to have both. But these are too similar and I am easy to confuse them.

2009-10-27

Is spamming really profittable?

Spam mails are everywhere and everyday. There is a question, "Is spamming really profitable?" One "academic" paper try to answer this question. They set up the bot net, set up fake web pages, send spam mails (350million spam emails in their experience), observe how many can go through, how many comes to the web page, and how many actually buy the medicine. This is the paper: Kanich et al. Spamalytics: an empirical analysis of spam marketing conversion. CASM, 2009, Vol.52, No.9, pp.99-107

I do not know how, but they claims it they cleared U.S. legal doctrine and ethics.

The authors said, "We would be the first to admit that these results represents a single data point and are not necessarily representative of spam as a whole." and they even said there is some danger to interpret the result's means. But, for 28 days experience, 350 million spam mails are sent, 82 million delivered (but, of course they do not know how many are shown up in user's inbox, most of them might be just filtered to the spam folder.), more than 10,000 web page visitors, and 28 purchased the medicine. The transaction was 2800 dollars. 100 dollars/day transaction. Their method only uses 1.5 % of the botnet ability, so, 7000 to 9500 dollars/day transaction are extrapolated. Of course there is a machine and human cost and also the business cost (if the spammer does the business) are not included, but, it could be profitable. This is an interesting paper.

2009-10-18

What is the rank of matrix

I will talk about the part of Introduction to Linear Algebra by Gilbert Strang, again.

There is an important indicator of matrix called "rank". According to the http://mathworld.wolfram.com/, the rank of a matrix is the dimension of the image of the matrix, corresponding to the number of linearly independent rows or columns of the matrix. This is a common definition in a textbook.

In the Strang book, the rank is explained from a bit different way. He explained rank and elimination together. The definition of rank is the number of pivots after the elimination. After all, they are the same, but I found interesting that how to approach to the concept of rank through the elimination.

Gaussian elimination is employed to diagonalize of a matrix. Because if we could diagonalize a matrix, we can easily solve the linear equations. Let's think about a bit of elimination. We could use the following matrix as an example matrix.

A =
|1 2 2 2|
|2 4 6 8|
|3 6 8 10|

Let's do elimination.

Subtract row 1st x 2 from row 2nd.

|1 2 2 2|
|0 0 2 4|
|3 6 8 10|

Subtract row 1st x 3 from row 3rd.

|1 2 2 2|
|0 0 2 4|
|0 0 2 4|

Subtract row 2nd from row 3rd.

|1 2 2 2|
|0 0 2 4|
|0 0 0 0|

Only two rows remain. The number of remaining rows are the rank, therefore, rank of this matrix is 2.

Let's think about a bit more what was done. Let's consider the matrix is a simultaneous equations. Then, the matrix

|1 2 2 2|
|2 4 6 8|
|3 6 8 10|

is

1 x1 + 2 x2 + 2 x3 + 2 x4 = b1 ... (1.1)
2 x1 + 4 x2 + 6 x3 + 8 x4 = b2 ... (1.2)
3 x1 + 6 x2 + 8 x3 + 10 x4 = b3 ... (1.3)

where, x1, x2, and x3 are unknowns and b1, b2, and b3 are the right hand side.

Elimination does a subtract or an add one equation from the others with multiplied by a constant. After these operations, there remains only "minimal number of equations in the matrix." You could ask me "what is the maximal or minimal number of equations in the matrix." It is also called "independent equations," but, the minimal number of equations means that how many equations are there in the matrix. You would say, that was three. But it is not always true, since you can make equations from the others.

The following (2.x) are created by elimination of (1.x).

1 x1 + 2 x2 + 2 x3 + 2 x4 = b'1 ... (2.1)
2 x3 + 4 x4 = b'2 ... (2.2)

The minimal number of equation means, if we have these (2.1) and (2.2), then we could make (1.1), (1.2), and (1.3). Therefore, (2.x) are the minimal set. In other words, the minimal set of equation to make (1.1), (1.2), and (1.3), are (2.1) and (2.2). For example, (2.1) * 2 + (2.2) makes (1.2). We could make more equations from (2.1). When we doubled (2.1)

1 x1 + 2 x2 + 2 x3 + 2 x4 = b1,

then,

2 x1 + 4 x2 + 4 x3 + 4 x4 = 2 b1.

We could multiply this three times or four times. But there doesn't create any new equations. Let's think about the simplest one.

The solution of

2 x = 6

is x = 3. If we doubled the equation, we could get

4 x = 12,

but the solution is still x = 3. These are all the same equation to find the x. we could use 2x = 6 to make 4 x = 12. There is no new information in these equations. Actually all can be made from x = 3.

We could make infinite number of equations based on a few number of equations by multiplying constants and adding them. In this case, the important thing is what is the minimal set of the equations. That's the "identity" equations of the matrix (not the identity matrix). A complex matrix could be actually a simple, but it just might look like a huge matrix. To understand a matrix, we should find the identity equations of the matrix. Elimination is the tool to find it out.

Now I could answer the question of the title, what is the rank of matrix. Matrix rank is the number of identity equation of the matrix. How many equations actually the matrix has, that is the rank. We can not simplify a matrix less than its rank. That's the reason, matrix rank is important.

「行列の正体見たり rank 数」斉

(This is a Senryu in Japanese, similar form of Haiku. But Haiku needs the words about the season by definition, and this does not have that. A translations could be "I see what you are, matrix, this is your rank." Hitoshi)


By the way, the rank of Laplacian matrix in differential geometry is n-1 when the size of matrix is n by n. The Laplacian matrix can be considered a local differentiation. There is one fix point is missing to determine the whole points. (The right hand side is differencial coordinate.) As an analogy, one order differential equation needs to one integral constant to determine the integration. Does this related with the rank of matrix? But wait a moment, Laplacian is the second order derivative, isn't it? OK, I still do not understand what the Laplacian matrix is. I will continue to study and find out what it is. At least, I should continue the Strang's book.

2009-10-03

C++ is hard... link considering dynamic library plugin.

This article might be interesting to someone who writes a plugin software with C/C++ using shared library (dynamic link library) mechanism. But, if you are not, I think this is totally rubbish.

When I develop a plugin, I could compile it and link it, but when I run it, I happen to see an error message, ``no such symbol''. I see the symbol in the executable via nm/dumpbin, but, it is not recognized from the plugin code at runtime.

Why this happens? Because the linker optimizes exposing symbols.

Some of the software needs symbols that are only used by its plugin. When you need to build such software, you need to ask to the linker to keep those symbols. Otherwise linker trashes un-referenced symbols. This is usually useful, otherwise my disk and memory will be filled up by copies of unused symbols.

To ask to the linker that ``please keep unused symbols since I might need them later,'' you need to specify --export-dynamic option (liker option) to the linker. If you use gcc as the linker driver, the option is -Wl,--export-dynamic.

In this way, the executable can have the symbols that are only referred by dlopen-ed code.

This method keeps all the symbols that are exported from the executable, but you can also specify the symbols you want to keep, one by one. This is effective to avoid reverse engineering a bit, since I can control which symbol I want to expose. But, this is a pain for developers and I sometimes envy the freesoftware writers that they do not need to care and manage these cumbersome problem. Sometimes I spent days only for this.

The motivation of writing this article comes from my job task. We also do this. Sometimes our customer complains our library, ``Hey, you don't give us the promised function! Where it is?'' Not only that, I made a bug in a make file, my colleague can not use our software. Sorry, Matthias.