Skip to content

Conversation

@Faye-yufan
Copy link
Contributor

@Faye-yufan Faye-yufan commented Oct 9, 2023

Overview

This PR aims to refactor the existing draw_geom function, turning some procedure and functional code into class methods. The changes improve the code scalability, particularly when we want to implement new features on the renderer side.
This PR is critical for future developments of issue #48

Changes

  • Refactored the Geom class to better adhere to OOP principles.
  • Added methods for selection and styling, making it easier to apply custom features.
  • Optimized the logic for preparing data.

TODO

  • Create group/ungroup virtual classes inherits from the basic geom class.
  • Create specific geom class inherits from the virtual classes.
  • Optimize the selected_funs, as we discussed previously.
  • Delete the old code and test complete with the new code.
  • Document function logic within code

@tdhock
Copy link
Collaborator

tdhock commented Oct 10, 2023

looks like a good start, thanks for sharing.

@Faye-yufan
Copy link
Contributor Author

Hi @tdhock , I am trying to understand the logic behind key_fun and id_fun. What are their functionality? I noticed they have been defined twice, the first time is here

var key_fun = null;
var id_fun = function(d){
return d.id;
};
if(g_info.aes.hasOwnProperty("key")){
key_fun = function(d){
return d.key;
};
}

The second time, they are part of an if-else clause, where their behavior changes when the geom type is one of the following: line, path, ribbon, or polygon.
key_fun = function(group_info){
return group_info.value;
};
id_fun = function(group_info){
var one_group = keyed_data[group_info.value];
var one_row = one_group[0];
// take key from first value in the group.
return one_row.id;
};

I've experimented with an animint viz using the key aes and attempted debugging through the web developer tools. However, I couldn't seem to trigger this function. Following the comment in the code, it appears to be related to group the data into 1 path.

// Lines, paths, polygons, and ribbons are a bit special. For
// every unique value of the group variable, we take the
// corresponding data rows and make 1 path. The tricky part is
// that to use d3 I do a data-bind of some "fake" data which are
// just group ids, which is the kv variable in the code below

Do you recall the logic behind this? Thanks.

@tdhock
Copy link
Collaborator

tdhock commented Oct 10, 2023

Hi Yufan, thanks for asking about key vs id.
The key_fun in JS is supposed to lookup the value which was specified in aes(key). key_fun is used in the context of a D3 data-bind, in JS code that is elements.data(kv, key_fun) which binds the data in kv to the svg elements, matching them according to keys from key_fun. This feature makes smooth transitions possible. (duration option in animint)

id_fun is used to assign the HTML id property using this JS code, elements.attr("id", id_fun) -- again the id_fun is supposed to lookup whatever value was specified as aes(id). This feature is used extensively in testing, where we lookup rendered HTML elements by id, and check their computed properties against expected values.

the reason why they are defined twice, is becuase there are two cases, based on whether the geom is grouped (path, ribbon, or polygon) or not (others). The first definition var id_fun =function(d){ return d.id; } is for non-grouped geoms, and the second definition key_fun = function(group_info){ return group_info.value; } is for grouped geoms. It would be great if you could please create virtual classes GroupedGeom and UngroupedGeom to better organize this logic.
It is necessary to treat them separately because for un-grouped geoms, there is one HTML element to create for every data point/csv row. But for grouped geoms, there is one HTML element to create for every aes(group) value (several data points/csv rows per group).
Does that help?
btw this should be documented somewhere, where do you think would be the best place? Maybe a wiki page about "JS internals" ?

@Faye-yufan
Copy link
Contributor Author

Thanks a lot for the detailed explanation! It is definitely helpful for me, and I love the idea of creating virtual classes for group and ungroup geoms.
As for documenting this, I can put the logic within the code as comment, and once the refactoring is complete, I can then consolidate the logic of functions into a wiki page.

@Faye-yufan
Copy link
Contributor Author

If I understand correctly, group geoms are those g_info.data_is_object == true, which includes line, path, ribbon and polygon.
Whereas ungrouped geoms are those defining styles in if clause - segment, linerange, vline, hline, text, point, tallrect, widerect, rect, boxplot.

@tdhock tdhock mentioned this pull request Nov 12, 2023
@tdhock
Copy link
Collaborator

tdhock commented Dec 14, 2023

hi @Faye-yufan when you get some time, can you please try to merge new master into this branch and resolve conflicts? There are some substantial new changes which I merged into master recently, which remove some duplicated logic in animint.js and so would be complementary to the OO/class re-organization you have been working on in this branch.

@Faye-yufan
Copy link
Contributor Author

@tdhock Thank you for modifying the logic, it helps a lot! I'm working on it now.

@Faye-yufan
Copy link
Contributor Author

Here are some notes for myself, please correct me if I'm wrong -

  1. This part of code is for process data grouping and subset selection, it also check if any '.variable' aes attribute exist(I guess this is from the original animint package before animint2?). The loop gets selected variables into an 2D array:
    g_info.subset_order.forEach(function (aes_name) {
    var selected, values;
    var new_arrays = [];
    if(0 < aes_name.indexOf(".variable")){
    selected_arrays.forEach(function(old_array){
    var some_data = chunk;
    old_array.forEach(function(value){
    if(some_data.hasOwnProperty(value)) {
    some_data = some_data[value];
    } else {
    some_data = {};
    }
    })
    values = d3.keys(some_data);
    values.forEach(function(s_name){
    var selected = Selectors[s_name].selected;
    var new_array = old_array.concat(s_name).concat(selected);
    new_arrays.push(new_array);
    })
    })
    }else{//not .variable aes:
    if(aes_name == "PANEL"){
    selected = PANEL;
    }else{
    var s_name = g_info.aes[aes_name];
    selected = Selectors[s_name].selected;
    }
    if(isArray(selected)){
    values = selected; //multiple selection.
    }else{
    values = [selected]; //single selection.
    }
    values.forEach(function(value){
    selected_arrays.forEach(function(old_array){
    var new_array = old_array.concat(value);
    new_arrays.push(new_array);
    })
    })
    }
    selected_arrays = new_arrays;
    });

    A question - Does this part ever going to be used in the code? Because the selected_array just got initialized above and there is nothing in it. If we do the loop right after that, it would not affect anything.
    selected_arrays.forEach(function(old_array){
    var some_data = chunk;
    old_array.forEach(function(value){
    if(some_data.hasOwnProperty(value)) {
    some_data = some_data[value];
    } else {
    some_data = {};
    }
    })
    values = d3.keys(some_data);
    values.forEach(function(s_name){
    var selected = Selectors[s_name].selected;
    var new_array = old_array.concat(s_name).concat(selected);
    new_arrays.push(new_array);
    })
    })


  1. This part is for getting the current one row of data, convert the data type from object into array:
    selected_arrays.forEach(function(value_array){
    var some_data = chunk;
    value_array.forEach(function(value){
    if (some_data.hasOwnProperty(value)) {
    some_data = some_data[value];
    } else {
    if(g_info.data_is_object){
    some_data = {};
    }else{
    some_data = [];
    }
    }
    });
    if(g_info.data_is_object){
    if(isArray(some_data) && some_data.length){
    data["0"] = some_data;
    }else{
    for(k in some_data){
    data[k] = some_data[k];
    }
    }
    }else{//some_data is an array.
    data = data.concat(some_data);
    }
    });

@tdhock
Copy link
Collaborator

tdhock commented Dec 29, 2023

yes the .variable suffix comes from animint (not 2) which required users to specify aes(clickSelects.variable=selector_name, clickSelects.value=selector_value) etc. In animint2 user specifies geom(clickSelects=c("selector_name"=selector_value)) but under the hood (not user visible) we still use the aes(clickSelects.variable) etc.

I believe the selected_arrays parts of the code are indeed used, for selecting the subset of data which corresponds to the current showSelected selection, to display.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants