Skip to content

Adding basic starlark exec function. #22167

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Jul 22, 2025
Merged

Conversation

fengttt
Copy link
Contributor

@fengttt fengttt commented Jul 15, 2025

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #22166 #22165 #9026

What this PR does / why we need it:

Adding a star lark stored procedure language.


PR Type

Enhancement


Description

  • Add Starlark language support for stored procedures

  • Implement mo.sql, mo.jq, mo.quote built-in functions

  • Add CREATE OR REPLACE PROCEDURE syntax

  • Update stored procedure schema with language column


Changes diagram

flowchart LR
  A["SQL Parser"] --> B["CREATE PROCEDURE"]
  B --> C["Language Detection"]
  C --> D["SQL Interpreter"]
  C --> E["Starlark Interpreter"]
  E --> F["mo.sql()"]
  E --> G["mo.jq()"]
  E --> H["mo.quote()"]
  F --> I["Database Operations"]
  G --> J["JSON Processing"]
  H --> K["SQL Escaping"]
Loading

Changes walkthrough 📝

Relevant files
Enhancement
12 files
authenticate.go
Update stored procedure creation and execution logic         
+88/-35 
starlark_interpreter.go
Add Starlark interpreter for stored procedures                     
+239/-0 
procedure.go
Add language support to procedure AST                                       
+40/-18 
func_builtin_starlark.go
Implement starlark and try_starlark functions                       
+153/-0 
upgrade.go
Add database upgrade handler for v2.3.0                                   
+104/-0 
json.go
Add JQ query processing utilities                                               
+97/-0   
util.go
Update expression evaluation for FeSession interface         
+11/-9   
session.go
Add user-defined variable management methods                         
+20/-19 
back_exec.go
Support user variables in background execution                     
+22/-2   
list_builtIn.go
Register starlark and try_starlark functions                         
+40/-0   
function_id.go
Add function IDs for Starlark functions                                   
+9/-1     
types.go
Add GetColumnCount method and SetUserDefinedVar                   
+3/-0     
Tests
4 files
upgrade_test.go
Add unit tests for upgrade functionality                                 
+85/-0   
authenticate_test.go
Update test calls with background parameter                           
+3/-3     
json_test.go
Add tests for JSON and JQ utilities                                           
+35/-0   
types_mock.go
Add GetColumnCount mock method                                                     
+7/-0     
Configuration changes
3 files
cluster_upgrade_list.go
Define cluster upgrade entries for procedure schema           
+49/-0   
predefined.go
Update stored procedure DDL schema                                             
+4/-2     
tenant_upgrade_list.go
Define empty tenant upgrade entries                                           
+19/-0   
Miscellaneous
1 files
log.go
Add logging utility for upgrade process                                   
+24/-0   
Additional files
17 files
go.mod +1/-0     
go.sum +2/-0     
types.go +2/-0     
back_self_handle.go +7/-0     
mysql_cmd_executor.go +2/-2     
plsql_interpreter.go +1/-1     
self_handle.go +1/-1     
stmt_kind.go +4/-0     
mysql_sql.go +8425/-8421
mysql_sql.y +12/-7   
build_show_util.go +2/-1     
function_id_test.go +4/-1     
procedure.result +2/-2     
starlark.result +81/-0   
starlark.sql +83/-0   
starlark_sql.result +68/-0   
starlark_sql.sql +56/-0   

Need help?
  • Type /help how to ... in the comments thread for any questions about Qodo Merge usage.
  • Check out the documentation for more information.
  • Copy link

    You are nearing your monthly Qodo Merge usage quota. For more information, please visit here.

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    🎫 Ticket compliance analysis 🔶

    9026 - Partially compliant

    Compliant requirements:

    • Fix stored procedure metadata type from FUNCTION to PROCEDURE (line 8597 sets type to "PROCEDURE")
    • Add owner column to mo_stored_procedure table (line 1097 adds 'lang' column and schema changes in predefined.go)

    Non-compliant requirements:

    • Implement show procedure status command to return proper format like MySQL

    Requires further human verification:

    • Verify that the owner column implementation follows mo_user_defined_function pattern as specified
    • Test show procedure status functionality
    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 PR contains tests
    🔒 Security concerns

    SQL injection:
    The code uses string formatting to build SQL queries without proper parameterization (lines 1108-1124 in authenticate.go). The updateMoStoredProcedureFormat and similar format strings directly interpolate values into SQL statements.

    Code execution: The Starlark interpreter allows execution of arbitrary code with database access through the mo.sql function, which could be exploited if user input is not properly validated.

    Privilege escalation: The stored procedure system allows users to execute code with potentially elevated privileges through the database connection.

    ⚡ Recommended focus areas for review

    SQL Injection

    String formatting in SQL queries without proper escaping could lead to SQL injection vulnerabilities. The format strings use direct string interpolation.

    	database_collation) values ('%s','%s','%s','%s','%s','%s','%s','%s','%s','%s','%s','%s','%s', '%s');`
    
    updateMoStoredProcedureFormat = `update mo_catalog.mo_stored_procedure
    		set 
    		    args = '%s',
    			lang = '%s',
    		    body = '%s',
    			db = '%s',
    		    definer = '%s',
    		    modified_time = '%s',
    		    type = '%s',
    		    security_type = '%s',
    		    comment = '%s',
    		    character_set_client = '%s',
    		    collation_connection = '%s',
    		    database_collation = '%s'
    		where proc_id = %d;`
    Security Risk

    Starlark interpreter allows execution of arbitrary code with database access through mo.sql function, which could be exploited for unauthorized database operations.

    func (si *starlarkInterpreter) moSql(thread *starlark.Thread, b *starlark.Builtin, args starlark.Tuple, kwargs []starlark.Tuple) (starlark.Value, error) {
    	var sql string
    	if err := starlark.UnpackPositionalArgs("sql", args, kwargs, 1, &sql); err != nil {
    		return nil, err
    	}
    
    	err := si.interp.bh.Exec(si.interp.ctx, sql)
    	if err != nil {
    		return starlark.String(err.Error()), nil
    	}
    
    	erArray, err := getResultSet(si.interp.ctx, si.interp.bh)
    	if err != nil {
    		return starlark.String(err.Error()), nil
    	}
    
    	if len(erArray) == 0 {
    		return starlark.None, nil
    	}
    
    	rows := make([]starlark.Value, len(erArray))
    	for i, er := range erArray {
    		rowsi := make(starlark.Tuple, er.GetColumnCount())
    		for j := range rowsi {
    			v, err := er.GetString(si.interp.ctx, uint64(i), uint64(j))
    			if err != nil {
    				return nil, err
    			}
    			rowsi[j] = starlark.String(v)
    		}
    		rows[i] = rowsi
    	}
    	return starlark.NewList(rows), nil
    Data Structure Change

    Change from map to slice for argList could break existing functionality and affects argument handling logic throughout the codebase.

    argList := make([]tree.ProcedureArgForMarshal, len(cp.Args))
    for i := 0; i < len(cp.Args); i++ {
    	curName := cp.Args[i].GetName(fmtctx)
    	fmtctx.Reset()
    
    	if curName == "mo" || strings.HasPrefix(curName, "mo.") || strings.HasPrefix(curName, "out_") {
    		return moerr.NewInvalidInput(ctx, "mo, mo.*, out_* are reserved and cannot be used as a procedure argument name")
    	}
    
    	argList[i] = tree.ProcedureArgForMarshal{
    		ArgName:   curName,
    		Name:      cp.Args[i].(*tree.ProcedureArgDecl).Name,
    		Type:      cp.Args[i].(*tree.ProcedureArgDecl).Type,
    		InOutType: cp.Args[i].(*tree.ProcedureArgDecl).InOutType,

    Copy link

    You are nearing your monthly Qodo Merge usage quota. For more information, please visit here.

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Impact
    Possible issue
    Fix incorrect indexing in result processing

    The loop uses the wrong index when accessing result data. The outer loop
    variable i represents the result array index, but er.GetString(uint64(i),
    uint64(j)) should use row index 0 since er is already the specific result.

    pkg/frontend/starlark_interpreter.go [106-117]

     rows := make([]starlark.Value, len(erArray))
     for i, er := range erArray {
    -    rowsi := make(starlark.Tuple, er.GetColumnCount())
    -    for j := range rowsi {
    -        v, err := er.GetString(si.interp.ctx, uint64(i), uint64(j))
    -        if err != nil {
    -            return nil, err
    +    rowCount := er.GetRowCount()
    +    rowsi := make([]starlark.Value, rowCount)
    +    for row := uint64(0); row < rowCount; row++ {
    +        cols := make(starlark.Tuple, er.GetColumnCount())
    +        for j := range cols {
    +            v, err := er.GetString(si.interp.ctx, row, uint64(j))
    +            if err != nil {
    +                return nil, err
    +            }
    +            cols[j] = starlark.String(v)
             }
    -        rowsi[j] = starlark.String(v)
    +        rowsi[row] = cols
         }
    -    rows[i] = rowsi
    +    rows[i] = starlark.NewList(rowsi)
     }
    • Apply / Chat
    Suggestion importance[1-10]: 9

    __

    Why: The suggestion correctly identifies a bug where the wrong index is used to access result set rows, which would lead to incorrect data retrieval or out-of-bounds errors.

    High
    Return proper errors instead of strings

    The function returns error messages as strings instead of proper errors, which
    can mask critical SQL execution failures. This makes it difficult to distinguish
    between successful results and error conditions in Starlark code.

    pkg/frontend/starlark_interpreter.go [86-95]

     func (si *starlarkInterpreter) moSql(thread *starlark.Thread, b *starlark.Builtin, args starlark.Tuple, kwargs []starlark.Tuple) (starlark.Value, error) {
         var sql string
         if err := starlark.UnpackPositionalArgs("sql", args, kwargs, 1, &sql); err != nil {
             return nil, err
         }
     
         err := si.interp.bh.Exec(si.interp.ctx, sql)
         if err != nil {
    -        return starlark.String(err.Error()), nil
    +        return nil, err
         }
         ...
     }

    [To ensure code accuracy, apply this suggestion manually]

    Suggestion importance[1-10]: 7

    __

    Why: The suggestion correctly points out that returning an error message as a string is poor practice; propagating a proper error is more robust and idiomatic for the Starlark runtime.

    Medium
    • More

    @fengttt fengttt force-pushed the fengttt-starlark branch from 5866d5e to bd70922 Compare July 22, 2025 17:15
    @fengttt fengttt merged commit 53bb34d into matrixorigin:main Jul 22, 2025
    16 of 17 checks passed
    @fengttt fengttt deleted the fengttt-starlark branch July 22, 2025 18:15
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    10 participants