-
Notifications
You must be signed in to change notification settings - Fork 345
[C++ BoundsSafety] Fix false positives when pointer argument is a function call #11046
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: next
Are you sure you want to change the base?
Conversation
…ction call rdar://155952016
@patrykstefanski This is my attempt to extend the existing |
} | ||
} | ||
return {E, false}; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that both trySimplifyDerefAddressof
and trySubstitute
attempt to simplify and substitute expressions. Maybe we could consider combining them into a single function?
I was thinking about something like this:
// Simplify '*&e' and/or substitute decl. We want to perform both at the same
// time, since this pattern might occur after a substitution:
// E: *x, DependentValues: {x->&y} which results in *&y
std::pair<const Expr *, bool>
trySimplifyAndSubstitute(const Expr *const E, bool AllowSubst,
const DependentValuesTy *DependentValues) const {
const Expr *X = E->IgnoreParenImpCasts();
bool HasDeref = false;
if (const auto *DerefOp = dyn_cast<UnaryOperator>(X);
DerefOp && DerefOp->getOpcode() == UO_Deref) {
HasDeref = true;
X = DerefOp->getSubExpr()->IgnoreParenImpCasts();
}
auto trySubstitute = [&](const Expr *E) -> std::pair<const Expr *, bool> {
if (AllowSubst && DependentValues) {
if (const auto *DRE = dyn_cast<DeclRefExpr>(E)) {
if (auto It = DependentValues->find(DRE->getDecl());
It != DependentValues->end())
return {It->second->IgnoreParenImpCasts(), true};
}
}
return {E, false};
};
bool Substituted = false;
std::tie(X, Substituted) = trySubstitute(X);
if (HasDeref) {
const auto *AddrOfOp = dyn_cast<UnaryOperator>(X);
if (!AddrOfOp || AddrOfOp->getOpcode() != UO_AddrOf)
return {E, false};
X = AddrOfOp->getSubExpr()->IgnoreParenImpCasts();
}
if (!Substituted)
std::tie(X, Substituted) = trySubstitute(X);
return {X, Substituted};
}
In that case, VisitDeclRefExpr
and VisitUnaryOperator
could look like this:
bool VisitDeclRefExpr(const DeclRefExpr *SelfDRE, const Expr *Other,
bool hasSelfBeenSubstituted,
bool hasOtherBeenSubstituted) {
const ValueDecl *SelfVD = SelfDRE->getDecl();
const auto [Self, Substituted] = trySimplifyAndSubstitute(
SelfDRE, !hasSelfBeenSubstituted, DependentValuesSelf);
if (Self != SelfDRE) {
// We only simplify '*&e', this couldn't happen here, since we have a DRE.
assert(Substituted);
// We substituted the expr, now recurse.
return Visit(Self, Other, true, hasOtherBeenSubstituted);
}
std::tie(Other, hasOtherBeenSubstituted) = trySimplifyAndSubstitute(
Other, !hasOtherBeenSubstituted, DependentValuesOther);
const auto *O = Other->IgnoreParenImpCasts();
if (const auto *OtherDRE = dyn_cast<DeclRefExpr>(O)) {
// Both SelfDRE and OtherDRE can be transformed from member expressions:
if (OtherDRE->getDecl() == SelfVD)
return true;
return false;
}
const auto *OtherME = dyn_cast<MemberExpr>(O);
if (MemberBase && OtherME) {
return OtherME->getMemberDecl() == SelfVD &&
Visit(OtherME->getBase(), MemberBase, hasSelfBeenSubstituted,
hasOtherBeenSubstituted);
}
return false;
}
bool VisitUnaryOperator(const UnaryOperator *SelfUO, const Expr *Other,
bool hasSelfBeenSubstituted,
bool hasOtherBeenSubstituted) {
if (SelfUO->getOpcode() != UO_Deref)
return false; // We don't support any other unary operator
const auto &[SimplifiedSelf, Substituted] = trySimplifyAndSubstitute(
SelfUO, !hasSelfBeenSubstituted, DependentValuesSelf);
if (SelfUO != SimplifiedSelf) {
return Visit(SimplifiedSelf, Other, Substituted, hasOtherBeenSubstituted);
}
std::tie(Other, hasOtherBeenSubstituted) = trySimplifyAndSubstitute(
Other, !hasOtherBeenSubstituted, DependentValuesOther);
if (const auto *OtherUO =
dyn_cast<UnaryOperator>(Other->IgnoreParenImpCasts())) {
if (SelfUO->getOpcode() == OtherUO->getOpcode())
return Visit(SelfUO->getSubExpr(), OtherUO->getSubExpr(),
hasSelfBeenSubstituted, hasOtherBeenSubstituted);
}
return false;
}
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, this would work! I'm fine with keeping this traverse-based comparison until we run into more challenges (if there would be any).
On the other hand, I made an experimental change: #11148 for potentially more intuitive substitution and comparison.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there anything wrong with my version of trySimplifyAndSubstitute
that you decided to implement this differently?
// Handle `PtrArgNoImp` has the form of `(char *) p` and `p` is a sized_by | ||
// pointer. This will be of pattern 5 after stripping the cast: | ||
if (const auto *CastE = dyn_cast<CastExpr>(PtrArgNoImp); CastE && isSizedBy) | ||
PtrArgNoImp = CastE->getSubExpr(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have tests for such casts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this is part of the refactoring I plan to do after this patch. I forgot to clean it up for now.
We have three functions doing similar job:
static bool isPtrBufferSafe(const Expr *Ptr, const Expr *Size, |
static bool isSafeSpanTwoParamConstruct(const CXXConstructExpr &Node, |
and
static bool isHardcodedCountedByPointerArgumentSafe( |
I plan to merge them into one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is causing some issues like the pattern of (p, strlen(p))
is only supported by span construction but not __sized_by parameters.
} | ||
} | ||
return {E, false}; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there anything wrong with my version of trySimplifyAndSubstitute
that you decided to implement this differently?
hasBeenSubstituted = true; | ||
return UO->getSubExpr(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also avoid duplicating this pattern:
auto I = DependentValues->find(DRE->getDecl());
if (I != DependentValues->end())
...
We have this 2x in this function + one more in VisitDeclRefExpr()
.
bool hasOtherBeenSubstituted) { | ||
if (!hasOtherBeenSubstituted) | ||
std::tie(Other, hasOtherBeenSubstituted) = | ||
trySubstituteAndSimplify(Other, DependentValuesOther); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: A matter of taste, but if we pass !hasOtherBeenSubstituted
to trySubstituteAndSimplify
, we could avoid guarding this with if (!hasOtherBeenSubstituted)
in every place.
const DependentValuesTy | ||
*DependentValues, // Deref may need subsitution | ||
bool &hasBeenSubstituted) -> const Expr * { | ||
auto *Deref = dyn_cast<UnaryOperator>(E->IgnoreParenImpCasts()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const auto
|
||
auto I = DependentValues->find(DRE->getDecl()); | ||
if (auto *DRE = dyn_cast<DeclRefExpr>(E->IgnoreParenImpCasts()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const auto
// | ||
// This function will be repeatedly called on the "Other" Expr. Because the | ||
// kind of "Other" stays unknown during the traversal. | ||
const std::pair<const Expr *, bool> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is returned by value, we could remove const
.
// kind of "Other" stays unknown during the traversal. | ||
const std::pair<const Expr *, bool> | ||
trySubstituteAndSimplify(const Expr *E, | ||
const DependentValuesTy *DependentValues) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function could be const
though: trySubstituteAndSimplify(...) const {
.
bool hasOtherBeenSubstituted) { | ||
if (!hasOtherBeenSubstituted) | ||
std::tie(Other, hasOtherBeenSubstituted) = | ||
trySubstituteAndSimplify(Other, DependentValuesOther); | ||
if (SelfUO->getOpcode() != UO_Deref) | ||
return false; // We don't support any other unary operator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could move
if (SelfUO->getOpcode() != UO_Deref)
return false; // We don't support any other unary operator
before
if (!hasOtherBeenSubstituted)
std::tie(Other, hasOtherBeenSubstituted) =
trySubstituteAndSimplify(Other, DependentValuesOther);
In this case, if opcode is not UO_Deref
, we could return early without doing substitution and hashmap lookup.
This PR attempts to fix false positives in such an example:
The analysis needs to compare the size of
f(len)
, which is specified by__counted_by(n)
, with the expected count of the first argument ofcb
, which is specified by__counted_by(count)
. The comparison interprets the two comparands at two "call contexts" respectively: the countn
needs to be interpreted at the callf(len)
with a mapping{n -> len}
and the countcount
needs to be interpreted at the callcb(f(len), len)
with a mapping{p -> f(len), count -> len}
.The existing compare algorithm is extended from assuming only one comparand needs a substitution map to assuming both comparands need substitution maps.
rdar://155952016