It's debatable what the filter should do when a non-UTF-8 encoding (e.g. base64) is selected. Should it be applied to the encoded representation of the term? For example, with base64 selected, would the filter "MSw" select the term "MSwxMTk="? That seems fraught with potential confusion and problems to me.
If we restrict the filter to UTF-8 (which seems more sensible to me), should we disable it for all other encodings?
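
To make the concern concrete, here is a minimal Python sketch of the two behaviours. The function names `matches_representation` and `matches_decoded` are hypothetical and not taken from the actual code; it also assumes the filter is a plain substring match, which may not be how the real filter works.

```python
import base64


def matches_representation(filter_text: str, raw_term: bytes) -> bool:
    """Hypothetical option A: match the filter against the base64 text shown to the user."""
    encoded = base64.b64encode(raw_term).decode("ascii")
    return filter_text in encoded


def matches_decoded(filter_text: str, raw_term: bytes) -> bool:
    """Hypothetical option B: match the filter against the term read as UTF-8."""
    try:
        text = raw_term.decode("utf-8")
    except UnicodeDecodeError:
        # The term is not valid UTF-8, so a UTF-8 filter cannot apply to it.
        return False
    return filter_text in text


term = b"1,119"  # displayed as "MSwxMTk=" when base64 is selected

print(matches_representation("MSw", term))   # filter typed in base64 form
print(matches_decoded("119", term))          # filter typed as plain text
print(matches_representation("MTE5", term))  # base64 of "119", typed as a filter
```

The last call is the confusing case: "119" is a substring of the term, but its own base64 encoding ("MTE5") is not a substring of "MSwxMTk=", because base64 encodes bytes in groups of three and the result depends on where the substring falls within those groups. Matching against the representation would therefore only behave predictably for prefixes that happen to align.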