Skip to content

Verification

Donizyo edited this page Apr 10, 2016 · 12 revisions

Format Checking

When a prospective class file is loaded by the Java Virtual Machine, the Java Virtual Machine first ensures that the file has the basic format of a class file. This process is known as format checking. The checks are as follows:

  • The first four bytes must contain the right magic number.
    In class file compiled with Oracle's javac, the very magic number is 0xCAFEBABE.
  • All recognized attributes must be of the proper length.
Attribute Name Attribute Length
ConstantValue attribute_length == 2
Code attribute_length == 12 + code_length + 8 * exception_table_length + length_sum(attributes)
StackMapTable attribute_length == 2 + length_sum(entries)
Exceptions attribute_length == 2 + number_of_exceptions * 2
InnerClasses attribute_length == 2 + 8 * number_of_classes
EnclosingMethod attribute_length == 4
Synthetic attribute_length == 0
Signature attribute_length == 2
SourceFile attribute_length == 2
SourceDebugExtension attribute_length == any
LineNumberTable attribute_length == 2 + 4 * line_number_table_length
LocalVariableTable attribute_length == 2 + 10 * local_variable_table_length
LocalVariableTypeTable attribute_length == 2 + 10 * local_variable_type_table_length
Deprecated attribute_length == 0
RuntimeVisibleAnnotations,
RuntimeInvisibleAnnotations
attribute_length == 2 + length_sum(annotations)
RuntimeVisibleParameterAnnotations,
RuntimeInvisibleParameterAnnotations
attribute_length == 1 + length_sum(parameter_annotations)
RuntimeVisibleTypeAnnotations,
RuntimeInvisibleTypeAnnotations
attribute_length == 2 + length_sum(annotations)
AnnotationDefault attribute_length == length_sum(default_value)
BoostrapMethods attribute_length == 2 + 6 * num_bootstrap_methods
MethodParameters attribute_length = 1 + 4 * parameters_count
  • The class file must not be truncated or have extra bytes at the end.

  • The constant pool must satisfy the constraints

Constant Pool Entry Item Requirements Description
CONSTANT_Class_info name_indexCONSTANT_Utf8_info Because arrays are objects, the opcodes anewarray and multianewarray - but not the opcode new - can reference array "classes" via CONSTANT_Class_info structures in the constant_pool table. For such array classes, the name of the class is the descriptor of the array type.
CONSTANT_Fieldref_info,
CONSTANT_Methodref_info,
CONSTANT_InterfaceMethodref_info
class_indexCONSTANT_Class_info
&&
name_and_type_indexCONSTANT_NameAndType_info
The class_index item of a CONSTANT_Methodref_info structure must be a class type, not an interface type.
The class_index item of a CONSTANT_InterfaceMethodref_info structure must be an interface type.
The class_index item of a CONSTANT_Fieldref_info structure may be either a class type or an interface type.
In a CONSTANT_Fieldref_info, the indicated descriptor must be a field descriptor. Otherwise, the indicated descriptor must be a method descriptor.
If the name of the method of a CONSTANT_Methodref_info structure begins with a '<' ('\u003c'), then the name must be the special name <init>. The return type of such a method must be void.
CONSTANT_String_info string_indexCONSTANT_Utf8_info
CONSTANT_Integer_info,
CONSTANT_Float_info
CONSTANT_Long_info,
CONSTANT_Double_info
CONSTANT_NameAndType_info name_indexCONSTANT_Utf8_info
&&
descriptor_indexCONSTANT_Utf8_info
CONSTANT_Utf8_info The bytes array contains the bytes of the string.
No bytes may have the value (byte) 0.
No byte may lie in the range (byte) 0xf0 to (byte) 0xff.
CONSTANT_MethodHandle_info 1 ≤ reference_kind ≤ 9 If reference_kind == 1 (REF_getField), 2 (REF_getStatic), 3 (REF_putField), or 4 (REF_putStatic), then reference_indexCONSTANT_Fieldref_info;
If reference_kind == 5 (REF_invokeVirtual) or 8 (REF_newInvokeSpecial), then reference_indexCONSTANT_Methodref_info
If reference_kind == 6 (REF_invokeStatic) or 7 (REF_invokeSpecial), then
if the class file version number is less then 52.0, then reference_indexCONSTANT_Methodref_info
else reference_indexCONSTANT_Methodref_info||CONSTANT_InterfaceMethodref_info.
If reference_kind == 9 (REF_invokeInterface), then reference_indexCONSTANT_InterfaceMethodref_info


If reference_kind == 5 (REF_invokeVirtual), 6 (REF_invokeStatic), 7 (REF_invokeSpecial), or 9 (REF_invokeInterface), then the name of the method represented by a CONSTANT_Methodref_info structure or a CONSTANT_InterfaceMethodref_info structure must not be <init> or <clinit>.
If reference_kind == 8 (REF_newInvokeSpecial), the name of the method represented by a CONSTANT_Methodref_info structure must be <init>.

Constraints on Java Virtual Machine Code

The code for a method, instance initialization method, or class or interface initialization method is stored in the code array of the Code attribute of a method_info structure of a class file. This section describes the constraints associated with the contents of the Code_attribute structure.

Static Constraints

The static constraints on the code in a class file specify how Java Virtual Machine instructions must be laid out in the code array and what the operands of individual instructions must be.

The static constraints on the instructions in the code array are as follows:

  • Only instances of the instructions documented in Instructions may appear in the code array. Instances of instructions using the reserved opcodes or any opcodes not documented in this specification must not appear in the code array.

    If the class file version number is 51.0 or above, then neither the jsr opcode nor the jsr_w opcode may appear in the code array.

  • The opcode of the first instruction in the code array begins at index 0.

  • For each instruction in the code array except the last, the index of the opcode of the next instruction equals the index of the opcode of the current instruction plus the length of that instruction, including all its opearands.

    The wide instruction is treated like any other instruction for these purposes; the opcode specifying the opearation that a wide instruction is to modify is treated as one of the operands of that wide instruction. The opcode must never be directly reachable by the computation.

  • The last byte of the last instruction in the code array must be the byte at index code_length - 1.

The static constraints on the operands of instructions in the code array are as follows:

  • The target of each jump and branch instruction (jsr, jsr_w, goto, goto_w, ifeq, ifne, ifle, iflt, ifge, ifgt, ifnull, ifnonnull, if_icmpeq, if_icmpne, if_icmple, if_icmplt, if_icmpge, if_icmpgt, if_acmpeq, if_acmpne) must be the opcode of an instruction within this method.

    The target of a jump or branch instruction must never be the opcode used to specify the operation to be modified by a wide instruction; a jump or branch target may be the wide instruction itself.

  • Each target, including the default, of each tableswitch instruction must be the opcode of an instruction within this method.

    Each tableswitch instruction must have a number of entries in its jump table that is consistent with the value of its low and high jump table opearnds, and its low value must be less than or equal to its high value.

    No target of a tableswitch instruction may be the opcode use to specify the operation to be modified by a wide instruction; a tableswitch target may be a wide instruction itself.

  • Each target, including the default, of each lookupswitch instruction must be the opcode of and instruction within this method.

    Each lookupswitch instruction must have a number of match-offset pairs that is consistent with the value of its nparis operand. The match-offset pairs must be sorted in increasing numerical order by signed match value.

    No target of a lookupswitch instruction may be the opcode used to specify the operation to be modified by a wide instruction; a lookupswitch target may be a wide instruction itself.

  • The operand of each ldc instruction and each ldc_w instruction must be a valid index into the constant_pool table. The constant pool entry referenced by that index must be of type:

    • CONSTANT_Integer, CONSTANT_Float, or CONSTANT_String, if the class file version number is less than 49.0
    • CONSTANT_Integer, CONSTANT_Float, CONSTANT_String, or CONSTANT_Class, if the class file version number is 49.0 or 50.0.
    • CONSTANT_Integer, CONSTANT_Float, CONSTANT_String, CONSTANT_Class, CONSTANT_MethodType, or CONSTANT_MethodHandle, if the class file version number is 51.0 or above.
  • The operands of each ldc2_w instruction must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be of type CONSTANT_Long or CONSTANT_Double.

    The subsequent constant pool index must also be a valid index into the constant pool, and the constant pool entry at that index must not be used.

  • The operands of each getfield, putfield, getstatic, and putstatic instruction must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must of type CONSTANT_Fieldref.

  • The indexbyte operands of each invokevirtual instruction must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be of type CONSTANT_Methodref.

  • The indexbyte operands of each invokespecial and invokestatic instruction must represent a valid index into the constant_pool table. If the class file version number is less than 52.0, the constant pool entry referenced by that index must be of type CONSTANT_Methodref; if the class file version number is 52.0 or above, the constant pool entry referenced by that index must be of type CONSTANT_Methodref or CONSTANT_InterfaceMethodref.

  • The indexbyte operands of each invokeinterface instruction must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be of type CONSTANT_InterfaceMethodref.

    The value of the count operand of each invokeinterface instruction must reflect the number of local variables necessary to store the arguments to be passed to the interface method, as implied by the descriptor of the CONSTANT_NameAndType_info structure referenced by the CONSTANT_InterfaceMethodref constant pool entry.

    The fourth operand byte of each invokeinterface instruction must have the value zero.

  • The indexbyte operands of each invokedynamic instruction must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be of type CONSTANT_InvokeDynamic.

    The third and fourth operand bytes of each invokedynamic instruction must have the value zero.

  • Only the invokespecial instruction is allowed to invoke an instance initialization method.

    No other method whose name begins with the character '<' ('\u003c') may be called by the method invocation instructions. In particular, the class or interface initialization method specially named <clinit> is never called explicitly from Java Virtual Machine instructions, but only implicitly by the Java Virtual Machine itself.

  • The operands of each instanceof, checkcast, new and anewarray instruction, and the indexbyte operands of each multianewarray instruction, must represent a valid index into the constant_pool table. The constant pool entry referenced by that index must be of type CONSTANT_Class.

  • No new instruction may referenced a constant pool entry of type CONSTANT_Class that represents an array type. The new instruction cannot be used to create an array.

  • No anewarray instruction may be used to create an array of more than 255 dimensions.

  • A multianewarray instruction must be used only to create an array of a type that has at least as many dimensions as the value of its dimensions operand. That is, while a multianewarray instruction is not required to create all of the dimensions of the array type referenced by its indexbyte operands, it must not attempt to create more dimensions than are in the array type.

    The dimensions operand of each multianewarray instruction must not be zero.

  • The atype operand of each newarray instruction must take one of the values T_BOOLEAN (4), T_CHAR (5), T_FLOAT (6), T_DOUBLE (7), T_BYTE (8), T_SHORT (9), T_INIT (10), or T_LONG (11).

  • The index operand of each iload, fload, aload, istore, fstore, astore, iinc, and ret instruction must be a non-negative integer no greater than max_locals - 1.

    The implicit index of each iload_<n>, fload_<n>, aload_<n>, istore_<n>, fstore_<n>, and astore_<n> instruction must be no greater than max_locals - 1.

  • The index operand of each lload, dload, lstore, and dstore instruction must be no greater than max_locals - 2.

    The implicit index of each lload_<n>, dload_<n>, lstore_<n>, and dstore_<n> instruction must be no greater than max_locals - 2.

  • The indexbyte operands of each wide instruction modifying an iload, fload, aload, istore, fsotre, astore, iinc or ret instruction must represent a non-negative integer no greater than max_locals - 1.

    The indexbyte opearnds of each wide instruction modifying an lload, dload, lstore, or dstore instruction must represent a non-negative integer no greater than max_locals - 2.

Structural Constraints

The structural constraints on the code array specify constraints on relationships between Java Virtual Machine instructions. The structural constraints are as follows:

  • Each instruction must only be executed with the appropriate type and number of arguments in the operand stack and local variable array, regardless of the execution path that leads to its invocation.

    An instruction operating on values of type int is also permitted to operate on values of type boolean, byte, char, and short.

  • If an instruction can be executed along several different execution paths, the operand stack must have the same depth prior to the execution of the instruction, regardless of the path taken.

  • At no point during execution can the operand stack grow to a depth greater than that implied by the max_stack item.

  • At no point during execution can more values be popped from the operand stack than it contains.

  • At no point during execution can the order of the local variable pair holding a value of type long or double be reversed or the pair split up. At no point can the variable of such a pair be operated on individually.

  • No local variable (or local variable pair, in the case of a value of type long or double) can be accessed before it is assigned a value.

  • Each invokespecial instruction must name an instance initialization method, a method in the current class or interface, a method in a superclass of the current class, a method in a direct superinterface of the current class or interface, or a method in Object.

    When an instance initialization method is invoked, an uninitialized class instance must be in an appropriate position on the operand stack. An instance initialization method must never be invoked on an initialized class instance.

    If an invokespecial instruction names an instance initialization method and the target reference on the operand stack is an uninitialized class instance for the current class, then invokespecial must name an instance initialization method from the current class or its direct superclass.

    If an invokespecial instruction names an instance initialization method and the target reference on the operand stack is a class instance created by an earlier new instruction, then invokespecial must name an instance initialization method from the class of that class instance.

    If an invokespecial instruction names a method which is not an instance initialization method, then the type of the target reference on the operand stack must be assignment compatible with the current class (JLS §5.2).

  • Each instance initialization method, except for the instance initialization method derived from the constructor of class Object, must call either another instance initialization method of this or an instance initialization method of its direct superclass super before its instance memebers are accessed.

    However, instance fields of this that are declared in the current class may be assigned before calling any instance initialization method.

  • When any instance method is invoked or when any instance variable is accessed, the class instance that contains the instance method or instance variable must already be initialized.

  • If there is an uninitialized class instance in a local variable in code protected by an exception handler, then

    • if the handler is inside an <init> method, the handler must throw an exception or loop forever
    • if the handler is not inside an <init> method, the uninitialized class instance must remain uninitialized.
  • There must never be an uninitialized class instance on the operand stack or in a local variable when a jsr or jsr_w instruction is executed.

  • The type of every class instance that is the target of a method invocation instruction must be assignment compatible with the class or interface type specified in the instruction (JLS §5.2).

  • The type of the arguments to each method invocation must be method invocation compatible with the method descriptor (JLS §5.3 §4.3.3).

  • Each return instruction must match its method's return type:

    • If the method returns a boolean, byte, char, short, or int, only the ireturn instruction may be used.
    • If the method returns a float, long, or double, only an freturn, lreturn, or dreturn instruction, respectively, may be used.
    • If the method returns a reference type, only areturn instruction may be used, and the type of the returned value must be assignment compatible with the return descriptor of the method (JLS §5.2, §4.3.3).
    • All instance initialization methods, class or interface initialization methods, and methods declared to return void must use only the return isntruction.
  • The type of every class instance accessed by a getfield instruction or modified by a putfield instruction must be assignment compatible with the class type specified in the instruction (JLS §5.2).

  • The type of every value stored by a putfield or putstatic instruction must be compatible with the descriptor with the descriptor of the field of the class instance or class being stored into:

    • If the descriptor type is boolean, byte, char, short, or int, then the value must be an int.
    • If the descriptor type is float, long, or double, then the value must be a float, long, or double, respectively.
    • If the descriptor type is a reference type, then the value must be of a type that is assignment compatible with the descriptor type (JLS §5.2).
  • The type of every value stored into an array by an aastore instruction must be a reference type.

    The component type of the array being stored into by the aastore instruction must also be a reference type.

  • Each athrow instruction must throw only values that are instances of class Throwable or subclasses or Throwable.

    Each class mentioned a catch_type item of the exception_table array of the method's Code_attribute structure must be Throwable or a subclass of Throwable.

  • If getfield or putfield is used to access a protected field declared in a superclass that is a member of a different run-time package than the current class, then the type of the class instance being accessed must be the same as or a subclass of the current class.

    If invokevirtual or invokespeicial is used to access a protected method declared in a superclass that is a member of a different run-time package than the current class, then the type of the class instance being accessed must be the same as or a subclass of the current class.

  • Execution never falls off the bottom of the code array.

  • No return address (a value of type returnAddress) may be loaded from a local variable.

  • The instruction following each jsr or jsr_w instruction may be returned to only by a single ret instruction.

  • No jsr or jsr_w instruction that is returned to may be used to recursively call a subroutine if that subroutine is already present in the subroutine call chain. (Subroutines can be nested when using try-finally constructs from within a finally clause.)

  • Each instance of type returnAddress can be returned to at most once.

    If a ret instruction returns to a point in the subroutine call chain above the ret instruction corresponding to a given instance of type returnAddress, then that instance can never be used as a return address.

Clone this wiki locally