JVM Startup Process

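The virtual machine's launch entry point is the JLI_Launch function in jdk/src/share/bin/java.c. The overall flow consists of the following steps:

Configure the JVM loading environment
Parse the virtual machine arguments
Set the thread stack size
Run the JavaMain function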

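The declaration of JLI_Launch is shown below. The entry point takes many parameters, including the full version string, the short (dot) version string, the Java arguments, the program name, and the launcher name: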

int
JLI_Launch(int argc, char ** argv,              /* main argc, argc */
        int jargc, const char** jargv,          /* java args */
        int appclassc, const char** appclassv,  /* app classpath */
        const char* fullversion,                /* full version defined */
        const char* dotversion,                 /* dot version defined */
        const char* pname,                      /* program name */
        const char* lname,                      /* launcher name */
        jboolean javaargs,                      /* JAVA_ARGS */
        jboolean cpwildcard,                    /* classpath wildcard */
        jboolean javaw,                         /* windows-only javaw */
        jint     ergo_class                     /* ergnomics policy */
);

First it performs some initialization and sets up debug output:

InitLauncher(javaw);
DumpState();
if (JLI_IsTraceLauncher()) {
    int i;
    printf("Command line args:\n");
    for (i = 0; i < argc ; i++) {
        printf("argv[%d] = %s\n", i, argv[i]);
    }
    AddOption("-Dsun.java.launcher.diag=true", NULL);
}

Next it selects an appropriate JRE version:

/*
 * Make sure the specified version of the JRE is running.
 *
 * There are three things to note about the SelectVersion() routine:
 *  1) If the version running isn't correct, this routine doesn't
 *     return (either the correct version has been exec'd or an error
 *     was issued).
 *  2) Argc and Argv in this scope are *not* altered by this routine.
 *     It is the responsibility of subsequent code to ignore the
 *     arguments handled by this routine.
 *  3) As a side-effect, the variable "main_class" is guaranteed to
 *     be set (if it should ever be set).  This isn't exactly the
 *     poster child for structured programming, but it is a small
 *     price to pay for not processing a jar file operand twice.
 *     (Note: This side effect has been disabled.  See comment on
 *     bugid 5030265 below.)
 */
SelectVersion(argc, argv, &main_class);

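Next it creates the JVM execution environment. For example, it determines the data model (32-bit or 64-bit) and reads and parses the JVM's own configuration from the jvm.cfg file: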

CreateExecutionEnvironment(&argc, &argv,
                               jrepath, sizeof(jrepath),
                               jvmpath, sizeof(jvmpath),
                               jvmcfg,  sizeof(jvmcfg));

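This function is only declared in a header file; its implementation is platform-specific. Next, the launcher dynamically loads the JVM shared library (libjvm.so on Linux), looks up the functions it exports, and stores them in ifn. The function that creates the JVM is among them: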

if (!LoadJavaVM(jvmpath, &ifn)) {
    return(6);
}

Then the JVM is initialized:

return JVMInit(&ifn, threadStackSize, argc, argv, mode, what, ret);

JVMInit then runs the JavaMain function in a new thread. JavaMain starts by creating the virtual machine:

/* Initialize the virtual machine */
start = CounterGet();
if (!InitializeJVM(&vm, &env, &ifn)) {
    JLI_ReportErrorMessage(JVM_ERROR1);
    exit(1);
}

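This first step initializes the virtual machine; if it fails, the launcher exits immediately.

Next it loads the main class, since the main class is the entry point of a Java program: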

/*
 * Get the application's main class.
 *
 * See bugid 5030265.  The Main-Class name has already been parsed
 * from the manifest, but not parsed properly for UTF-8 support.
 * Hence the code here ignores the value previously extracted and
 * uses the pre-existing code to reextract the value.  This is
 * possibly an end of release cycle expedient.  However, it has
 * also been discovered that passing some character sets through
 * the environment has "strange" behavior on some variants of
 * Windows.  Hence, maybe the manifest parsing code local to the
 * launcher should never be enhanced.
 *
 * Hence, future work should either:
 *     1)   Correct the local parsing code and verify that the
 *          Main-Class attribute gets properly passed through
 *          all environments,
 *     2)   Remove the vestages of maintaining main_class through
 *          the environment (and remove these comments).
 *
 * This method also correctly handles launching existing JavaFX
 * applications that may or may not have a Main-Class manifest entry.
 */
mainClass = LoadMainClass(env, mode, what);

For some Java programs that have no main method, such as JavaFX applications, the launcher also obtains the application's own main class:

/*
 * In some cases when launching an application that needs a helper, e.g., a
 * JavaFX application with no main method, the mainClass will not be the
 * applications own main class but rather a helper class. To keep things
 * consistent in the UI we need to track and report the application main class.
 */
appClass = GetApplicationClass(env);

Initialization is then finished:

/*
 * PostJVMInit uses the class name as the application name for GUI purposes,
 * for example, on OSX this sets the application name in the menu bar for
 * both SWT and JavaFX. So we'll pass the actual application class here
 * instead of mainClass as that may be a launcher or helper class instead
 * of the application class.
 */
PostJVMInit(env, appClass, vm);

It then looks up the main method of the main class:

/*
 * The LoadMainClass not only loads the main class, it will also ensure
 * that the main method's signature is correct, therefore further checking
 * is not required. The main method is invoked here so that extraneous java
 * stacks are not in the application stack trace.
 */
mainID = (*env)->GetStaticMethodID(env, mainClass, "main", "([Ljava/lang/String;)V");

In bytecode, void main(String[] args) is described by the method descriptor ([Ljava/lang/String;)V. Next, the main method is invoked:

/* Invoke main method. */
(*env)->CallStaticVoidMethod(env, mainClass, mainID, mainArgs);

From this call on, the Java program runs until the main method returns:

/*
 * The launcher's exit code (in the absence of calls to
 * System.exit) will be non-zero if main threw an exception.
 */
ret = (*env)->ExceptionOccurred(env) == NULL ? 0 : 1;
LEAVE();

Finally, LEAVE (a macro in java.c) detaches the current thread and destroys the JVM.
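The last steps of JavaMain (load the main class, look up main by its descriptor, invoke it, and turn an uncaught exception into an exit code) can be imitated in plain Java. This is only a rough sketch, not how the launcher is implemented: the real code goes through JNI, and the names MiniLauncher, Hello, and Broken are hypothetical and exist only for this illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class MiniLauncher {
    // Hypothetical sample "applications" used only for this demonstration.
    static class Hello {
        public static void main(String[] args) {
            System.out.println("Hello from " + args[0]);
        }
    }

    static class Broken {
        public static void main(String[] args) {
            throw new IllegalStateException("boom");
        }
    }

    // Loads the class, finds main by its descriptor, invokes it,
    // and returns 0 on normal completion or 1 if anything was thrown.
    static int launch(String mainClassName, String... mainArgs) {
        try {
            // Roughly what LoadMainClass does for the launcher
            Class<?> mainClass = Class.forName(mainClassName);
            // Same shape as the descriptor passed to GetStaticMethodID
            MethodType type = MethodType.methodType(void.class, String[].class);
            System.out.println("descriptor: " + type.toMethodDescriptorString());
            MethodHandle mainID = MethodHandles.lookup().findStatic(mainClass, "main", type);
            // Roughly CallStaticVoidMethod
            mainID.invokeExact(mainArgs);
            return 0;
        } catch (Throwable t) {
            // Roughly the ExceptionOccurred check: failure means a non-zero exit code
            return 1;
        }
    }

    public static void main(String[] args) {
        System.out.println("exit code: " + launch(Hello.class.getName(), "MiniLauncher"));
        System.out.println("exit code: " + launch(Broken.class.getName()));
    }
}
```

The descriptor line printed by this sketch is ([Ljava/lang/String;)V, the same string the launcher passes to GetStaticMethodID.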

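Memory Management

In C/C++, we often create objects or store data by requesting memory explicitly. This raises further questions: when should that memory be released, and how can memory be used most efficiently?

For example, allocating memory dynamically in C and storing data in it: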

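#include <stdlib.h>
#include <stdio.h>

int main(void){
    // Dynamically allocate space for 4 ints (calloc zero-initializes it,
    // so the loop below never reads uninitialized memory)
    int* memory = calloc(4, sizeof(int));
    if (memory == NULL) return 1;   // allocation failed
    // Set the value of the first int
    memory[0] = 10;
    // Set the value of the second int
    memory[1] = 2;
    // Print every value in the allocated region
    for (int i = 0; i < 4; i++){
        printf("%d ", memory[i]);
    }
    // Free the memory the pointer refers to
    free(memory);
    // Finally, set the pointer to NULL
    memory = NULL;
    return 0;
}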

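In Java, this kind of operation is not allowed. Java only lets us work directly with primitive types and object types; how memory is actually allocated is not up to us but is controlled by the JVM. This saves a great deal of memory-related work and is very convenient. However, once a memory problem does occur, we cannot manage the affected memory ourselves the way we can in C/C++, because every memory operation is performed by the JVM. Only by understanding the JVM's memory management mechanism can we find a solution when memory-related problems appear.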

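Memory Areas

The JVM manages memory by partitioning it; each memory area has its own responsibility.

Memory is divided into five areas: the method area, the heap, the VM stack, the native method stack, and the program counter. The method area and the heap are shared by all threads; they are created when the virtual machine starts and destroyed when it exits. The VM stack, native method stack, and program counter are isolated per thread: each thread has its own, created automatically when the thread starts and destroyed when it ends. Once memory has been partitioned and the Java program starts running, the JVM execution engine and the native library interface use memory from the appropriate area.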
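The difference between shared and thread-private areas can be observed from ordinary Java code. In the sketch below (the class name SharedVsPrivate and its methods are made up for this illustration), the AtomicInteger lives on the heap and is seen by every thread, while the variable local lives in each thread's own stack frame.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedVsPrivate {
    // One object on the heap, reachable through a static field: visible to every thread.
    static final AtomicInteger shared = new AtomicInteger();

    // Runs on the calling thread's own stack, so `local` is private to that thread.
    static int work(int iterations) {
        int local = 0;
        for (int i = 0; i < iterations; i++) {
            local++;
            shared.incrementAndGet();
        }
        return local;
    }

    // Starts the given number of threads and returns each thread's private count.
    static int[] run(int threads, int iterations) {
        int[] locals = new int[threads];
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool[t] = new Thread(() -> locals[id] = work(iterations));
            pool[t].start();
        }
        for (Thread th : pool) {
            try {
                th.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return locals;
    }

    public static void main(String[] args) {
        int[] locals = run(2, 1000);
        System.out.println("per-thread counts: " + locals[0] + ", " + locals[1]);
        System.out.println("shared count: " + shared.get());
    }
}
```

Each thread reports its own count of 1000, while the shared counter ends at 2000 because both threads updated the same heap object.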

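Program Counter

The JVM's program counter indicates the address of the bytecode instruction that the current thread is executing (while a native method is running, its value is undefined). The bytecode interpreter executes instructions according to the program counter and updates it so that it points to the next instruction to execute.

In Java's multithreaded environment, every thread has its own program counter. Threads are switched by a scheduling mechanism (such as the operating system's time-slice round-robin scheduling). When a thread is switched out, its program counter records the address of the bytecode instruction it had reached; when the thread is scheduled again, the JVM continues with the following instructions based on that value.

Because the program counter only needs to record the address of the current thread's bytecode instruction, it occupies very little memory and is usually the smallest part of the JVM's memory layout.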
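To make the role of the program counter concrete, here is a toy interpreter loop. It is a simplified sketch and not HotSpot's interpreter: the instruction set (Op), the Insn class, and ToyInterpreter itself are invented for this illustration. The sample program corresponds to the method body `int a = 10; int b = 20; return a + b;`.

```java
class ToyInterpreter {
    // Hypothetical mini instruction set, loosely modeled on JVM opcodes.
    enum Op { PUSH, STORE, LOAD, ADD, RETURN }

    // One instruction: an opcode plus an operand (unused by ADD and RETURN).
    static final class Insn {
        final Op op;
        final int arg;
        Insn(Op op, int arg) { this.op = op; this.arg = arg; }
    }

    // Instructions for: int a = 10; int b = 20; return a + b;
    static Insn[] sampleProgram() {
        return new Insn[] {
            new Insn(Op.PUSH, 10), new Insn(Op.STORE, 0),
            new Insn(Op.PUSH, 20), new Insn(Op.STORE, 1),
            new Insn(Op.LOAD, 0),  new Insn(Op.LOAD, 1),
            new Insn(Op.ADD, 0),   new Insn(Op.RETURN, 0)
        };
    }

    static int run(Insn[] code, int maxLocals, int maxStack) {
        int[] locals = new int[maxLocals]; // local variable table
        int[] stack = new int[maxStack];   // operand stack
        int sp = 0;                        // next free operand-stack slot
        int pc = 0;                        // program counter: index of the next instruction
        while (true) {
            Insn insn = code[pc]; // fetch the instruction the counter points at
            pc++;                 // move the counter to the following instruction
            switch (insn.op) {
                case PUSH:  stack[sp++] = insn.arg; break;
                case STORE: locals[insn.arg] = stack[--sp]; break;
                case LOAD:  stack[sp++] = locals[insn.arg]; break;
                case ADD:   stack[sp - 2] = stack[sp - 2] + stack[sp - 1]; sp--; break;
                case RETURN: return stack[sp - 1];
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(run(sampleProgram(), 2, 2)); // prints 30
    }
}
```

If several such interpreters ran side by side, each would need its own pc, which is why the program counter is a per-thread area.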

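VM Stack

The VM stack is a critical part of the JVM. As the name suggests, it is a stack structure: whenever a method is executed, the Java virtual machine creates a stack frame for it (a frame is simply one element of the stack). The frame holds information about the current method, such as the local variable table, the operand stack, dynamic linking information, and the method exit.

The local variable table holds the method's local variables (including its parameters); its size is already fixed in the class file.
The operand stack is the stack structure used while bytecode is executed, as seen earlier.
Each stack frame holds a reference to the runtime constant pool of the class that owns the current method. When the current method needs to call another method, the corresponding symbolic reference is found in the runtime constant pool and converted into a direct reference, after which the target method can be called directly. This is dynamic linking.
The method exit describes how the method ends: by throwing an exception or by returning normally.

For example: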

public class Main {
    public static void main(String[] args) {
        int res = a();
        System.out.println(res);
    }

    public static int a(){
        return b();
    }

    public static int b(){
        return c();
    }

    public static int c(){
        int a = 10;
        int b = 20;
        return a + b;
    }
}

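When main runs, the three methods are called in turn: a() -> b() -> c() -> return.

First, main is invoked:

Local variable table: args (the parameter of main), plus a slot for res
Operand stack: empty
Dynamic linking: reference to the runtime constant pool of Main, used to resolve the call to a()
Return address: none (main is the program's entry point)

Then main calls a():

Local variable table: empty (a() takes no parameters)
Operand stack: empty
Dynamic linking: reference to the runtime constant pool of Main, used to resolve the call to b()
Return address: the instruction in main that follows the call to a()
Stack state, top to bottom: a(), main()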

…

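Once c() has been called:

Stack frame of c():

Local variable table: a = 10, b = 20
Operand stack: holds the operands a and b, then the result of a + b
Dynamic linking: reference to the runtime constant pool of Main
Return address: the instruction in b() that follows the call to c()
Stack state, top to bottom: c(), b(), a(), main()

The frames are then popped in stack order as each method returns, and finally the result is printed.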
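The stack state described above can be observed at run time by capturing a stack trace inside c(). The sketch below reuses the same call chain; the class name StackTraceDemo and the frames field are additions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class StackTraceDemo {
    // Method names of the frames captured inside c(), topmost frame first.
    static List<String> frames;

    static int a() { return b(); }

    static int b() { return c(); }

    static int c() {
        // A Throwable records the frames of the current thread when it is created.
        StackTraceElement[] trace = new Throwable().getStackTrace();
        frames = new ArrayList<>();
        for (StackTraceElement element : trace) {
            frames.add(element.getMethodName());
        }
        int a = 10;
        int b = 20;
        return a + b;
    }

    public static void main(String[] args) {
        int res = a();
        System.out.println(frames); // starts with c, b, a, main
        System.out.println(res);    // prints 30
    }
}
```

Depending on how the program is started, additional frames may appear below main, but the top of the list always reflects the call order c(), b(), a().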

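Native Method Stack

The native method stack plays much the same role as the VM stack, except that it supports the execution of native methods (non-Java methods, usually implemented in C/C++).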
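Whether a given method is native can be checked through reflection. The following small sketch (NativeMethodDemo and isNative are illustrative names, not part of any library) compares Object.hashCode, which is declared native in the JDK, with String.length, which is an ordinary Java method.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class NativeMethodDemo {
    // Returns true if the named method is declared with the native modifier.
    static boolean isNative(Class<?> owner, String name, Class<?>... parameterTypes) {
        try {
            Method method = owner.getDeclaredMethod(name, parameterTypes);
            return Modifier.isNative(method.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isNative(Object.class, "hashCode")); // true
        System.out.println(isNative(String.class, "length"));   // false
    }
}
```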

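Method Area

The method area is shared by the whole Java application. It stores class information, constants, static variables, the cache of JIT-compiled code, and similar data. It can be roughly divided into two parts: the class information table and the runtime constant pool.

It mainly stores the following:

Class information:
- The fully qualified name of the class
- The fully qualified name of the class's direct superclass
- The class modifiers (such as public, abstract, final)
- The interfaces the class implements
Field information:
- Each field's name, type, and modifiers (such as public, private, static)
Method information:
- Each method's name, return type, parameter list, and modifiers (such as public, static)
- Each method's bytecode, together with the sizes of its operand stack and local variable table
Runtime Constant Pool:
- Stores the literals generated at compile time (such as strings and numeric constants) and symbolic references (to classes, methods, and fields)
- The runtime constant pool is part of the method area; every class or interface has its own
Static Variables:
- A class's static variables (those declared static) conceptually belong to the method area (in HotSpot since JDK 7 they are stored on the heap together with the class's java.lang.Class object)
Class loader reference:
- A reference to the class loader that loaded the class
Method code:
- The methods' bytecode and related metadata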
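Much of the class information listed above is visible to Java code through the reflection API. The sketch below prints some of it for a given class; ClassInfoDemo and describe are names chosen for this example, and reflection only exposes a view of the metadata, not the method area's internal layout.

```java
import java.lang.reflect.Modifier;
import java.util.ArrayList;

class ClassInfoDemo {
    // Collects the class-level metadata that reflection exposes for the given type.
    static String describe(Class<?> type) {
        StringBuilder sb = new StringBuilder();
        sb.append("name: ").append(type.getName()).append('\n');
        sb.append("superclass: ").append(type.getSuperclass()).append('\n');
        sb.append("modifiers: ").append(Modifier.toString(type.getModifiers())).append('\n');
        sb.append("interfaces: ").append(type.getInterfaces().length).append('\n');
        sb.append("declared fields: ").append(type.getDeclaredFields().length).append('\n');
        sb.append("declared methods: ").append(type.getDeclaredMethods().length).append('\n');
        // null means the class was loaded by the bootstrap class loader
        sb.append("class loader: ").append(type.getClassLoader()).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(ArrayList.class));
    }
}
```

For a core class such as java.util.ArrayList the class loader line shows null, because core classes are loaded by the bootstrap class loader.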

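Heap

The heap is shared by the whole Java application and is the largest memory area in the virtual machine. Its job is to store and manage objects and arrays. The garbage collection mechanism discussed later mainly operates on this area.

Heap Overflow and Stack Overflow

While a Java program runs, memory capacity is not unlimited. If we create too many objects or arrays that are too large, the heap can no longer hold new objects or arrays, and an error occurs. For example: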
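The current size of the heap can be inspected from inside a program with the standard Runtime API. This is a minimal sketch (HeapInfoDemo and snapshot are illustrative names); the numbers depend on the JVM settings and the machine.

```java
class HeapInfoDemo {
    // Returns { max heap, currently reserved heap, free part of the reserved heap } in bytes.
    static long[] snapshot() {
        Runtime runtime = Runtime.getRuntime();
        return new long[] { runtime.maxMemory(), runtime.totalMemory(), runtime.freeMemory() };
    }

    public static void main(String[] args) {
        long[] heap = snapshot();
        long mib = 1024 * 1024;
        System.out.println("max heap:   " + heap[0] / mib + " MiB");
        System.out.println("total heap: " + heap[1] / mib + " MiB");
        System.out.println("free heap:  " + heap[2] / mib + " MiB");
    }
}
```

The maximum value is what the -Xmx option controls.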

import java.util.ArrayList;
import java.util.List;

public class HeapOverflowExample {
    public static void main(String[] args) {
        List<Object> list = new ArrayList<>();
        while (true) {
            list.add(new Object()); // Keep creating objects until the heap is exhausted
        }
    }
}

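When this runs, the JVM throws java.lang.OutOfMemoryError: Java heap space, i.e. a heap overflow error.

A stack overflow occurs when a program allocates so many stack frames that the stack memory is exhausted. Stack memory stores the frames of method calls, including local variables, the operand stack, and the method return address. For example: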

public class StackOverflowExample {
    public static void main(String[] args) {
        infiniteRecursion(); // Infinite recursion, which overflows the stack
    }

    private static void infiniteRecursion() {
        infiniteRecursion(); // Recursive call to itself
    }
}

When this runs, the JVM throws java.lang.StackOverflowError.

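Property	Heap Overflow	Stack Overflow
Memory area	Heap	Stack
Contents	Object instances, arrays	Stack frames of method calls (local variables, operand stack, method return address, etc.)
Error type	`java.lang.OutOfMemoryError: Java heap space`	`java.lang.StackOverflowError`
Common causes	Memory leaks, too many objects, heap size set too small	Recursion too deep, stack size set too small
Remedies	Check for memory leaks, increase the heap size, optimize the code	Check the recursion termination condition, increase the stack size, optimize the code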
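To get a feel for how deep the stack actually is, the recursion depth reached before a StackOverflowError can be measured. The sketch below catches the error purely for demonstration (production code should fix the recursion instead); RecursionDepthDemo and measure are names invented here, and the depth varies with the frame size and the -Xss setting.

```java
class RecursionDepthDemo {
    // Number of frames pushed so far by recurse().
    static int depth;

    static void recurse() {
        depth++;
        recurse();
    }

    // Recurses until the stack is exhausted and returns the depth that was reached.
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // By the time we get here the frames have been popped again,
            // so it is safe to continue.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before overflow: " + measure());
    }
}
```

Running it with a larger stack, for example `java -Xss4m RecursionDepthDemo`, should report a noticeably larger depth.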

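Allocating Off-Heap Memory

Besides storing object data on the heap, we can also allocate off-heap memory (direct memory), i.e. memory that is not managed by the JVM. We have to allocate and free this memory ourselves; under the hood, the JVM simply obtains it by calling malloc from its C/C++ code. Direct memory is limited by the memory available on the machine, and an allocation may fail with an OutOfMemoryError.

Here we need to introduce a class for off-heap memory operations: Unsafe. It cannot be instantiated with new, and ordinary code has no direct way to obtain an instance: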
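For everyday code, the supported way to use off-heap memory is a direct ByteBuffer from java.nio, which is released automatically once the buffer object becomes unreachable. A minimal sketch (the class DirectBufferDemo and the method roundTrip are illustrative names):

```java
import java.nio.ByteBuffer;

class DirectBufferDemo {
    // Writes an int into freshly allocated off-heap memory and reads it back.
    static int roundTrip(int value) {
        // Allocate 4 bytes outside the Java heap
        ByteBuffer buffer = ByteBuffer.allocateDirect(Integer.BYTES);
        buffer.putInt(0, value);
        return buffer.getInt(0);
    }

    public static void main(String[] args) {
        System.out.println(ByteBuffer.allocateDirect(Integer.BYTES).isDirect()); // true
        System.out.println(roundTrip(6666666)); // 6666666
    }
}
```

Unlike raw addresses, a direct buffer checks its bounds on every access, so it cannot be read after it has been freed.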

public final class Unsafe {

    private static native void registerNatives();
    static {
        registerNatives();
        sun.reflect.Reflection.registerMethodsToFilter(Unsafe.class, "getUnsafe");
    }

    private Unsafe() {}

    private static final Unsafe theUnsafe = new Unsafe();
  
    @CallerSensitive
    public static Unsafe getUnsafe() {
        Class<?> caller = Reflection.getCallerClass();
        if (!VM.isSystemDomainLoader(caller.getClassLoader()))
            throw new SecurityException("Unsafe");   // Not a JDK class, so access is denied.
        return theUnsafe;
    }

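Once the Unsafe instance has been obtained through reflection, we can start allocating off-heap memory. For example, suppose we want to allocate space for one int and store an int value in it: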

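import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class Main {
    public static void main(String[] args) throws ReflectiveOperationException {
        // Obtain the Unsafe instance through reflection (its private static field theUnsafe)
        Field unsafeField = Unsafe.class.getDeclaredField("theUnsafe");
        unsafeField.setAccessible(true);
        Unsafe unsafe = (Unsafe) unsafeField.get(null);

        // Allocate 4 bytes of memory and get the address of that location
        long address = unsafe.allocateMemory(4);
        // Store an int value at that address
        unsafe.putInt(address, 6666666);
        // Read the int value back from that address
        System.out.println(unsafe.getInt(address));	// prints 6666666
        // Free the allocated memory
        unsafe.freeMemory(address);

        // The memory has been freed, so the data is gone. This read is a
        // use-after-free on a dangling pointer: it may print garbage or crash the JVM
        System.out.println(unsafe.getInt(address));
    }
}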