• Hey, guest user. Hope you're enjoying NeoGAF! Have you considered registering for an account? Come join us and add your take to the daily discourse.

POP QUIZ: Find the bug in the following code...

Status
Not open for further replies.

aaaaa0

Member
The code is C++.

Code:
class CFoo
{
public:
    LPCWSTR GetText()
    {
        return m_sbstrText ? m_sbstrText : L"";
    }
private:
    CComBSTR            m_sbstrText;
};

int main()
{
    LPCWSTR pwszFoo = NULL;
    CFoo Foo;

    pwszFoo = Foo.GetText();

    ASSERT( pwszFoo && wcscmp( pwszFoo, L"" ) == 0 );

    return 0;
}

o_O

Edit: Clarify that the code is C++. Added an ASSERT to indicate which values should be valid for pwszFoo.

Edit: Fixed the assert.

Edit: Clarified the hints.

Edit: Removed the parameters from main to remove distractions from problem.

Hint 1: This code will crash some of the time. Sometimes it will succeed.

Hint 2: This code compiles on Visual Studio. You need to include atlbase.h in your headers.

Hint 3: This bug is quite subtle. I only saw it after staring at it for quite some time. Try stepping through the code in your debugger, line by line. Concentrate on line 6.

Hint 4: This bug is an example of how you can get yourself into trouble when you use object oriented programming without understanding what's going on under the covers.
 

fart

Savant
this is C++, not java

my guess is it's a typing error Ccocomsdfoaoiueoirehtert can't be coerced to lamswesoistrng or whatever the hell those crazy names are. my C++ is pretty rusty though. that L"" looks like an error to me too. you're either an identifier or a literal, dude, you can't be both!

oh, if lwmpadfoaiehtoihretdstring can't be coerced to something integer-like your conditional substituter will error out too (will it consider non-null to be true? damn i'm rusty)

can you tell us what the types actually are? i don't know the std libs very well
 

aaaaa0

Member
fart said:
L"" looks like an error to me too. you're either an identifier or a literal, dude, you can't be both!

Prefixing a string with an L means that the string is a unicode string.

"This is a string" tells the compiler that the string between the quotes should be defined as an ASCII string.

L"This is a string" tells the compiler that the string between the quotes should be defined as a Unicode string.

This is not relevant to the bug.

can you tell us what the types actually are? i don't know the std libs very well

CComBSTR is a standard BSTR string wrapper class. The documentation for it is here:

http://msdn.microsoft.com/library/en-us/vclib/html/_atl_CComBSTR.asp?frame=true

A BSTR is a count prefixed null-terminated Unicode string. The pointer points to the beginning of the string, but pointer-1 contains the length of the string. You don't need to care, CComBSTR takes care of that.

LPCWSTR is a standard typedef. It means "pointer to a const null terminated unicode character string".
 

fart

Savant
ok, i'm guessing m_substr is automatically unwrapped to field m_str which should be null by the default constructor, but it's not because the libraries suck or something
 

aaaaa0

Member
fart said:
ok, i'm guessing m_substr is automatically unwrapped to field m_str which should be null by the default constructor, but it's not because the libraries suck or something

m_sbstrText is correctly unwrapped to NULL for the conditional test. It's the rest of line 6 that's the problem...

Here's another hint:

(a) ? a : b

Consider if a and b are different types. What is the type of the object that's returned from that expression if (a) resolves to FALSE.
 

fart

Savant
by the MSDN cond op reference this shouldn't compile then. if the m_substr doesn't unwrap, one's a class type and the other is a null const. if it does unwrap the pointer type will be converted but L"" should still evaluate to null even if it's a BSTR pointer? C++ typing is terrible.
 

aaaaa0

Member
fart said:
by the MSDN cond op reference this shouldn't compile then. if the m_substr doesn't unwrap, one's a class type and the other is a null const. if it does unwrap the pointer type will be converted but L"" should still evaluate to null even if it's a BSTR pointer? C++ typing is terrible.

Hint: CComBSTR has multiple overloaded constructors, and the compiler is free to create temporary objects in expressions.

(a) ? a : b will always return objects with the type of a.

If (a) resolves to TRUE, then the expression will return a.
If (a) resolves to FALSE, then the expression will return b coerced to the type of a.
 

Ecrofirt

Member
well, here's my guess from my very limited C++ knowledge.

Anything we did had int_main, not int_tmain

no idea if that's it or not.
 

aaaaa0

Member
Ecrofirt said:
well, here's my guess from my very limited C++ knowledge.

Anything we did had int_main, not int_tmain

no idea if that's it or not.

tmain just means that the main function can take an argv[] of type TCHAR. It's not relevant to the problem. (And I should have removed it to get rid of a distraction, sorry.)
 

fart

Savant
hmm.. the compiler doesn't initialize a BSTR wrapper to point to the literal (stack-allocated i'm guessing) does it? unwrapping that would produce a dangling pointer and then the crashing you're talking about i guess.
 

aaaaa0

Member
fart said:
hmm.. the compiler doesn't initialize a BSTR wrapper to point to the literal (stack-allocated i'm guessing) does it? unwrapping that would produce a dangling pointer and then the crashing you're talking about i guess.

The compiler does invoke the correct BSTR overloaded constructor, passing the literal empty string as input.

The BSTR then allocates and makes a copy of the literal into its internal member. This all works just fine.

But now we're getting close to the fatal flaw...

Notice the return type of GetText()?
 

fart

Savant
it shouldn't unwrap to L"" though. it will still unwrap to a BSTR. if the constructor works properly m_str should point to the start of the unicode character string L"" (which is null)

on return the BSTR wrapper unwraps to a non-null pointer to m_str (which was null) and then the compiler instantiated object is wiped off the call stack?
 

aaaaa0

Member
cubicle47b said:
L"" is not a pointer.

Actually, it is. L"" is a constant Unicode character array of length 1, containing a single character NULL.

In C/C++, if you have "char somearray[20]" then "somearray" and "&somearray" are the same thing.
 
So the const part is the problem then. It will actually work with L"" but not when returning m_substrText. ??

It's been years since I seriously programmed in C++.

edit: Whee! Completely wrong.
 

aaaaa0

Member
on return the BSTR wrapper unwraps to a non-null pointer to m_str (which was null) and then the compiler instantiated object is wiped off the call stack?

YUP. EXACTLY. fart rocks!

Because of the ?: operator, the compiler constructs a temporary object of type CComBSTR. It uses the string overload of the constructor. CComBSTR then copies the literal into its internal member. Then the compiler notices the function doesn't return a CComBSTR... so it uses a cast operator to get back to the member and returns that. At which point the compiler constructed temporary destructs and frees the internal member of the temporary CComBSTR.

Thus resulting in the function returning a pointer to freed memory. Which results occasionally in a crash.

Subtle huh?

o_O

This is a GREAT interview question. :)
 

aaaaa0

Member
fart said:
this is why strong typing rocks

Unfortunately strong typing doesn't really help here, since everything the compiler did went through a completely valid cast.

The real problem is the combination of not noticing what the ?: operator really does, the compiler being a bit overeager in finding cast operators, and a class that returns a pointer to an internal member.

There are really only 2 solutions to this bug:

1. Understand what CComBSTR is really doing under the covers.

2. Use a language with GARBAGE COLLECTION. (Which has a whole other set of problems, but does solve this one.)
 

fart

Savant
yes, i see.

i still think strong typing would have helped here because the design of the ternary ? in C++ is too arbitrary for a type system like oh, java's (java rocks!). the compiler ran around looking for casts because it allows textually obvious type errors to compile as long as they've been provided for somewhere by someone.

it's really semantics though. you're right, C++ was strongly typed in this instance.

thanks for the exercise aaaa0, it was instructive and fun!
 
Status
Not open for further replies.
Top Bottom