DeyjaScript 9 : Arrays and Generics

posted in Jemgine
Published October 02, 2011
Advertisement
For arrays in this language, I had two requirements. First, they had to be dynamically sized, and second, they had to be type-safe. But, when I want about trying to actually implement them, I discovered that my current implementation of the type system could handle only one or the other. I could create arrays, but like in C++, the array itself could have no members. Dynamic sizing requires member functions on the array for adding and removing elements, and there is no way to add these members to the array without implementing some sort of generic type. The second option was to create a dynamically sized array that wasn't type-safe. This didn't require any special syntax since it looks like an ordinary type (called 'array'), but it did require every type in the language to have a common base. For types defined in scripts, that was simple. Their base defaults to 'object'. For primitive types, it was a bit more complicated. Instead of being a C# primitive boxed into an object, they are a C# primitive boxed into an object boxed into a ScriptObject. It is nasty, but in the long run, worth it. For either of these array implementation options, the other behavior can be built on top using generics. I decided to provide a dynamically-sized array of anything, as it should be useful for many data structures I might want to represent in the scripts.

The array isn't technically part of the language. It's one of the system types that are always available, but the script engine uses the public interface to bind them. Any client code could bind a type that behaves exactly like it.


var arrayType = addSystemType("array", objectType);
addSystemMemberFunction(arrayType, "construct", "void", (parms) =>
{
parms[0].members[0] = new List();
return null;
});
addSystemMemberFunction(arrayType, "add", "void", (parms) =>
{
(parms[1].members[0] as List).Add(parms[0]);
return null;
}, "object");
addSystemMemberFunction(arrayType, "__indexGet", "object", (parms) =>
{
return (parms[1].members[0] as List)[(parms[0].members[0] as int?).Value];
}, "int");
addSystemMemberFunction(arrayType, "__indexSet", "object", (parms) =>
{
var index = (parms[1].members[0] as int?).Value;
var list = (parms[2].members[0] as List);
var obj = parms[0];
list[index] = obj;
return obj;
}, "object", "int");


There will also have to be some methods for removal, for retrieving the count, etc, but I haven't written them. Those named '__indexGet' and '__indexSet' are special methods called by operator[]. Any type can support operator[] by implementing these functions. Function names that begin with double underscores are from this point on reserved for implementation details like this. Also notice that when a function returns void, it doesn't need to create a special ScriptObject to represent void.

I mentioned generics, and I really wanted type-safe arrays, so it was time to implement them. First, I added generic parameters to the grammar for typenames. So Foo is a valid typename, and so is Foo. Typenames became a tree of nested typenames, and, frankly, became way more complex than I expected. Parsing is not so difficult. I add generic parameters to object declarations too. Now I need to resolve these new typenames to actual types. TypenameNode does the heavy lifting.


internal static Scope findType(bool searchUp, ScopeStack context, Scope currentScope, TypenameToken of)
{
Scope nextScope = null;
if (of.hasGenericParameters)
{
var templateName = of.singleLevelName;
var instancedType = currentScope.findLocalType(templateName, searchUp);
if (instancedType != null) nextScope = instancedType;
else
{
var declarationScope = currentScope.findLocalType(of.identifier, searchUp);
var genericParameters = new List();
foreach (var parameter in of.genericParameters)
genericParameters.Add(findType(true, context, currentScope, parameter));
var Engine = context.globalScope.tag as ScriptEngine;
var Object = declarationScope.tag as ObjectDeclarationNode;
nextScope = Engine.EnqueueGenericTypeForCompilation(Object, genericParameters);
}
}
else
nextScope = currentScope.findLocalType(of.identifier, searchUp);
if (of.next == null || nextScope == null) return nextScope;

return findType(false, context, nextScope, of.next);
}


Ow! TypenameToken is a linked list of tokens, where a token is an identifier plus an optional list of generic arguments. This function does one thing of particular note; it looks for an instance of the generic type that takes the same parameter types. If it finds one, great, it just returns that one (This little detail will allow me to implement typed arrays of primitive types far more efficiently, assuming I do something to prevent a script from deriving from a primitive type). If not, it tells the script engine to compile it. The script engine will delay compilation until the current module is finished, but it will immediately create a scope for the generic type, register all it's members, and resolve all it's typenames. It has to do this right away so that whatever called TypenameNode.findType can finish it's compilation.

internal Scope EnqueueGenericTypeForCompilation(ObjectDeclarationNode decl, List genericParameters)
{
var into = new Scope();
var newDecl = decl.Clone() as ObjectDeclarationNode;
genericTypesAwaitingCompilation.Enqueue(new pendingGeneric
{
decl = newDecl,
genericParameters = genericParameters,
into = into
});

if (genericParameters.Count != decl.genericParamters.Count)
throw new CompileError("Wrong number of generic arguments.");
var newTypeName = newDecl.localScope.thisType + "<";
for (int i = 0; i < genericParameters.Count; ++i)
{
if (i != 0) newTypeName += ",";
newTypeName += genericParameters.fullTypeName;
}
newTypeName += ">";

into.thisType = newTypeName;
into.tag = newDecl;
newDecl.registerMembers(decl.localScope.ownerScope, into);

for (int i = 0; i < genericParameters.Count; ++i)
into.genericParameters.Type = genericParameters;

newDecl.resolveTypes(new ScopeStack(globalScope, into), into);
into.finalizeScope();
into.parent = null;
return into;
}


It is necessary to clone the entire abstract syntax tree for the generic type. This is because each node keeps some state that's set in the 'gather information' compile pass, and used in the 'emit bytecode' pass. Since we are just doing the first pass on the generic type, and saving the second pass for later, we need a copy of that information for each instantiation. If you tried to use TypedArray and TypedArray in the same module, whichever one was mentioned second would trample the state of the first. Wait, why can't we just compile it completely right away? Because a generic argument might be a type in the module that's using the generic type, and it's first pass isn't finished yet. Actually cloning the syntax tree is straightforward. I don't need to worry about the extra information that Irony puts into AstNodes, because it won't be used.

public object Clone()
{
var result = Activator.CreateInstance(this.GetType());
if (result is CompilableNode) (result as CompilableNode).AsString = AsString;
foreach (var child in this.ChildNodes)
{
var newInstance = (child as ICloneable).Clone();
if (child is CompilableNode) (child as CompilableNode)._cloneImpl(newInstance);
(result as AstNode).ChildNodes.Add(newInstance as AstNode);
}
return result;
}


A few types use _cloneImpl to do some extra state transfer, and a few override clone entirely. TypenameNode gets away with public object Clone() { return this; }!

After cloning the AST, I figure out the name of this instance so that later attempts to use this same generic with these same parameters won't repeat this effort. Finally, I get down to compiling. RegisterMembers, in addition to adding all the members and methods, populates a list of generic parameter names. I use this list to fill in the types with the generic parameters specified. Later, when asked for a type, Scope will look to see if the name asked for exists in this list. If it does, it won't search it's member types, it will just return whatever type is in the list. Now I resolve the generic's types. There's a terrible bug here I'll leave as an exercise to the reader, but here's a hint : Where is the module in that scope stack? I return the scope for TypenameNode to use as the generic type instance.

Finally, I can actually write TypedArray.

TypedArray {
array data;
void construct() { data = new array(); }
T __indexGet(int i) { return (data as T); }
T __indexSet(T t, int i) { return ((data = t) as T); }
void add(T t) { data.add(t); }
//etc
}


It exists in the 'Core' module, so you can use it like 'Core:TypedArray'.

Some closing thoughts
*This is probably the only scripting language to support generics before it supports for loops. But I'm glad for that, because I would have had to modify them anyway.
*When resolving the types of a function declaration, I had to reach into the function implementation and look for any generic types so they can be instanced before the function is compiled.
*You could technically use the type 'TypedArray' without any generic arguments. But none of it's methods will work, as it will never, and in fact can't, be compiled.
*These 'generics' are more like C++ templates than C# generics. However, the compilation model won't support a test-substitution without major changes so SFINAE will probably never be an option.
*The error messages produced by the compiler are entirely useless.
0 likes 0 comments

Comments

Nobody has left a comment. You can be the first!
You must log in to join the conversation.
Don't have a GameDev.net account? Sign up!
Profile
Author
Advertisement
Advertisement