

    Resin 10,000

    Resin 4.0 has been load tested with 10,000 simultaneous keepalive connections and 10,000 simultaneous async comet connections.

    In an HTTP server, most of the browsers are idle, waiting for the user to finish reading and click on the next page. The browser and Resin can keep the TCP socket connected for improved efficiency, while taking minimal resources until the next page is needed. Because Resin uses non-blocking I/O such as the Linux epoll() call, it can maintain thousands of keepalive connections with little overhead.

    Resin v. Apache

    Resin is an excellent web server for both static pages and for load-balancing to multiple backend application servers. For a web-tier server, static page performance and load-balance performance are both important, as is the ability to proxy cache pages. The following benchmarks give a quick comparison between Resin and Apache as web-tier servers: both are very close in performance, although Resin is slightly faster than Apache in most of these cases.

    These results were benchmarked with Resin 3.1.0 and Apache 2.2.3 with a pair of Debian Linux machines using a 1G ethernet. Resin's proxy cache was disabled to match Apache's default no proxy-cache configuration, but no other special configurations were applied to either server.

    The first set of benchmarks compares static page serving using a 1k page to simulate small image files and a 64k page to simulate normal web pages. For the small pages, Resin was about 5% faster than Apache, and for large pages, the two were essentially identical.

    Static Pages
                                          RESIN (OPS PER SECOND)   APACHE (OPS PER SECOND)
    1K HTML (1 CLIENT, 1 KEEPALIVE)       3,537 OPS                3,287 OPS
    1K HTML (10 CLIENT, 4 KEEPALIVE)      19,568 OPS               16,466 OPS
    64K HTML (1 CLIENT, 1 KEEPALIVE)      874 OPS                  859 OPS
    64K HTML (10 CLIENT, 4 KEEPALIVE)     1,800 OPS                1,804 OPS

    The second set of benchmarks compares load balancing of JSP pages to a backend Resin server. Both a 1k page and a 64k page were simulated. For comparison, the performance of Resin serving the same JSP page as a standalone HTTP server is also provided. Again, for small pages Resin is about 5-10% faster than Apache and is essentially identical for larger pages.

    Load Balancing for Resin JSP
                                         RESIN LOAD BALANCING   APACHE LOAD BALANCING   RESIN STANDALONE
    1K JSP (1 CLIENT, 1 KEEPALIVE)       2,269 OPS              1,989 OPS               3,903 OPS
    1K JSP (10 CLIENT, 4 KEEPALIVE)      14,119 OPS             10,351 OPS              26,620 OPS
    64K JSP (1 CLIENT, 1 KEEPALIVE)      579 OPS                604 OPS                 826 OPS
    64K JSP (10 CLIENT, 4 KEEPALIVE)     1,668 OPS              1,661 OPS               1,799 OPS

    Caveats

    As always, no artificial benchmark can replace measuring your own application with your own configuration and load. In most cases, other considerations like application performance and database performance will dominate (although proxy caching can make slow applications run as fast as static pages). These numbers in particular come from trivial tests with a simple load. They do measure the maximum performance of the web server, so they are valuable information, but they are very different from a benchmark of a complete application.

    Warnings aside, these results do indicate that many sites should seriously consider using Resin as their web-tier load-balancing server. (After benchmarking your own application, of course.)

    Heap size

    The allocation of memory for the JVM is specified using -X options when starting Resin. The exact options may depend on the JVM that you are using; the examples here are for the Sun JVM.

    JVM OPTION PASSED TO RESIN   MEANING
    -Xms                         initial java heap size
    -Xmx                         maximum java heap size
    -Xmn                         the size of the heap for the young generation

    Example: resin.xml startup with heap memory options
    <resin xmlns="http://caucho.com/ns/resin">
    <cluster id="">
    
      <server id="" address="127.0.0.1" port="6800">
        <jvm-arg>-Xmx500M</jvm-arg>
        <jvm-arg>-Xms500M</jvm-arg>
        <jvm-arg>-Xmn100M</jvm-arg>
    
        <http port="80"/>
      </server>
    
      ...
    
    </cluster>
    </resin>
    

    It is good practice with server-side Java applications like Resin to set the minimum -Xms and maximum -Xmx heap sizes to the same value.

    For efficient garbage collection, the -Xmn value should be lower than the -Xmx value.

    Heap size does not determine the amount of memory your process uses

    If you monitor your java process with an OS tool like top or taskmanager, you may see the amount of memory you use exceed the amount you have specified for -Xmx. -Xmx limits only the java heap size. Java will allocate memory for other things as well, including a stack for each thread. It is not unusual for the total memory consumption of the VM to exceed the value of -Xmx.
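
    As a rough illustration of why the process can be larger than -Xmx, the sketch below adds one stack per thread to the heap. The function name and the figure of 250 threads are assumptions chosen for the example. The estimate ignores everything else the JVM allocates (its own code, native buffers and so on), and stack space is reserved rather than necessarily resident, so treat the result as an order-of-magnitude guide only.

```python
def estimate_process_memory_mb(max_heap_mb, thread_count, stack_kb):
    """Rough estimate: Java heap plus one stack per thread, in megabytes."""
    stack_total_mb = thread_count * stack_kb / 1024
    return max_heap_mb + stack_total_mb

# Hypothetical server: -Xmx500M, 250 threads, 2048k of stack per thread.
print(estimate_process_memory_mb(500, 250, 2048))  # 1000.0
```

    In this hypothetical case the thread stacks alone account for as much address space as the heap.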

    Garbage collection

    (thanks to Rob Lockstone for his comments)

    There are essentially two GC threads running. One is a very lightweight thread which does "little" collections primarily on the Eden (a.k.a. Young) generation of the heap. The other is the Full GC thread which traverses the entire heap when there is not enough memory left to allocate space for objects which get promoted from the Eden to the older generation(s).

    If there is a memory leak or inadequate heap allocated, eventually the older generation will start to run out of room causing the Full GC thread to run (nearly) continuously. Since this process "stops the world", Resin won't be able to respond to requests and they'll start to back up.

    The amount allocated for the Eden generation is the value specified with -Xmn. The amount allocated for the older generation is the value of -Xmx minus the -Xmn. Generally, you don't want the Eden to be too big or it will take too long for the GC to look through it for space that can be reclaimed.
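
    The split described above can be worked out directly from the jvm-arg values. The following is a minimal sketch, assuming sizes are written with an optional k, m or g suffix as in the resin.xml example earlier; parse_size and generation_sizes are hypothetical helper names, not part of Resin or the JDK.

```python
import re

_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}

def parse_size(text):
    """Convert a JVM-style size such as '500M' or '2048k' to bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, suffix = match.groups()
    return int(number) * _UNITS[suffix.lower()]

def generation_sizes(jvm_args):
    """Return (young, old) sizes in bytes from -Xmx and -Xmn arguments."""
    sizes = {}
    for arg in jvm_args:
        for flag in ("-Xmx", "-Xmn"):
            if arg.startswith(flag):
                sizes[flag] = parse_size(arg[len(flag):])
    young = sizes["-Xmn"]
    old = sizes["-Xmx"] - young   # older generation = -Xmx minus -Xmn
    return young, old

MB = 1024 ** 2
young, old = generation_sizes(["-Xmx500M", "-Xms500M", "-Xmn100M"])
print(young // MB, old // MB)  # 100 400
```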


    Stack size

    Each thread in the VM gets a stack. The stack size limits the number of threads that you can have: with too large a stack size you will run out of memory, because each thread is allocated more memory than it needs.

    The Resin startup scripts (resin.exe on Windows, resin.sh on Unix) will set the stack size to 2048k, unless it is specified explicitly. 2048k is an appropriate value for most situations.

    Stack configuration
    <JVM-ARG>   MEANING
    -Xss        the stack size for each thread

    -Xss determines the size of the stack, for example -Xss1024k. If the stack space is too small, you will eventually see a java.lang.StackOverflowError exception.

    Some people have reported that it is necessary to change stack size settings at the OS level for Linux. A call to ulimit may be necessary, and is usually done with a command in /etc/profile:

    Limit thread stack size on Linux
    unix> ulimit -s 2048
    

    Monitoring the JVM

    JDK 5 includes a number of tools that are useful for monitoring the JVM. Documentation for these tools is available from the Sun website. For JDKs prior to 5, Sun provides the jvmstat tools.

    The most useful tool is jconsole. Details on using jconsole are provided in the Administration section of the Resin documentation.

    Example: jconsole configuration
    <resin xmlns="http://caucho.com/ns/resin">
    <cluster id="">
    
      <server-default>
        <jvm-arg>-Dcom.sun.management.jmxremote</jvm-arg>
      </server-default>
    
      <server id="" address="127.0.0.1" port="6800"/>
    
      ...
    </cluster>  
    </resin>
    
    Example: jconsole launching
     ... in another shell window ... 
    
    win> jconsole.exe
    unix> jconsole
    
    Choose Resin's JVM from the "Local" list.

    jps and jstack are also useful, providing a quick command line method for obtaining stack traces of all current threads. Details on obtaining and interpreting stack traces are in the Troubleshooting section of the Resin documentation.

    jps and jstack
    # jps
    12903 Jps
    20087 Resin
    # jstack 20087
    Attaching to process ID 20087, please wait...
    Debugger attached successfully.
    Client compiler detected.
    JVM version is 1.5.0-beta2-b51
    Thread 12691: (state = BLOCKED)
     - java.lang.Object.wait(long) (Compiled frame; information may be imprecise)
     - com.caucho.util.ThreadPool.runTasks() @bci=111, line=474 (Compiled frame)
     - com.caucho.util.ThreadPool.run() @bci=85, line=423 (Interpreted frame)
     - java.lang.Thread.run() @bci=11, line=595 (Interpreted frame)
    
    
    Thread 12689: (state = BLOCKED)
     - java.lang.Object.wait(long) (Compiled frame; information may be imprecise)
     - com.caucho.util.ThreadPool.runTasks() @bci=111, line=474 (Compiled frame)
     - com.caucho.util.ThreadPool.run() @bci=85, line=423 (Interpreted frame)
     - java.lang.Thread.run() @bci=11, line=595 (Interpreted frame)
    
    ...
    
    

    How many concurrent users can a Resin server handle?

    This is not a question that can be answered in a general way. It is very dependent on the particular application that Resin is used for. Factors such as database usage, how the session object is used, the use of server side caching, and application architecture in general have a significant effect on the capabilities of a website.

    The best (and only practical) way to answer this question is to perform some benchmarking tests for your particular application on a server similar to the one that will host the website. The freely available httperf tool, as well as various others, are useful for this purpose.

    When using testing tools, 500 "concurrent threads" does not mean the same thing as "500 concurrent users". A typical user is not constantly making requests to the server. Typical usage involves a request for a page (with possible subsequent requests for images), and then a period of inactivity as the user reads or watches the content that has been downloaded.

    The ratio of users to threads again depends on the application involved. For example, if the ratio for an application is 10:1, then 2500 users will use at most 250 threads on the server.
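
    The relationship between users and threads can be sketched as simple arithmetic. The helper names and the timings below are assumptions made up for illustration. In particular, deriving the ratio from service time and reading time is a simplification, and only a benchmark of your own application gives real numbers.

```python
import math

def threads_needed(concurrent_users, users_per_thread):
    """Peak busy threads for a given user-to-thread ratio, rounded up."""
    return math.ceil(concurrent_users / users_per_thread)

def ratio_from_timings(service_seconds, think_seconds):
    """Ratio implied by time spent waiting on the server versus reading."""
    return (service_seconds + think_seconds) / service_seconds

# Hypothetical user: 0.5s of server time per page, then 9.5s of reading.
ratio = ratio_from_timings(service_seconds=0.5, think_seconds=9.5)
print(ratio)                        # 20.0
print(threads_needed(1000, ratio))  # 50
```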

    Ideally, application benchmarks use "user scenario" scripts. The script imitates what a typical user will do, including pauses between requests. This kind of script is useful for providing an accurate picture of web server usage.

    The primary configuration item in Resin for handling a greater load is thread-max. The default in resin.xml can be adjusted upwards to handle increased load. The upper limit is determined by the underlying operating system.

    If anticipated load overruns a Resin server, either with CPU usage or with encountering OS thread limitations, clustering can be used to add another server to share the load.

    How does Resin use JNI?

    The JNI code is compiled on the various Unix systems when the ./configure; make; make install step is performed during installation. Windows has precompiled dlls.

    Resin uses JNI in certain critical performance areas, such as low level socket connections and file operations. JNI is also used to interface with the OpenSSL libraries.

    A significant benefit in particular is in Resin's ability to handle keepalives. With JNI, Resin does not need a thread for each keepalive connection. The low-level poll() (or select() if poll() is not available) functions are used. The end result is the possibility of many more keepalives than if a thread were needed for each keepalive.

    The fallback if JNI is not available is to use the JDK equivalents of the faster JNI calls. Also, OpenSSL is only available through JNI.

    Resin indicates that JNI is being used with a log message at startup:

    Loaded Socket JNI library.

    If JNI is not available, the log message is:

    Socket JNI library is not available.

    General

    What are the best things to tune for better performance?

    The main configuration item is the dependency-check-interval, especially on Windows. For deployment, you should set it to something high, like 60s or larger.

    You can also change the cache-mapping values, especially for stuff like *.gif files that don't change. Higher values mean that the browsers won't need to go back to the server.

    Other than that, most of the default configuration values are pretty good, so you normally won't need to touch them.

    The most important performance tweak you can make is to set Expires or better Last-Modified and/or ETag values on your servlet/JSP output. If the servlet/JSP output only changes every 15 minutes, as for a news page, then caching it can be a big performance win.

    Of course, for stuff like shopping carts and stuff that's personalized, that won't help. But for many sites, the most heavily hit pages can be cached.
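
    To make the header advice concrete, here is a minimal sketch of the three headers for a page that changes every 15 minutes. It is plain Python rather than servlet code, and cache_headers is a hypothetical helper. Deriving the ETag from a hash of the body is one common choice, not something Resin requires. In a servlet the same values would be set on the response with calls such as HttpServletResponse.setHeader and setDateHeader.

```python
import hashlib
from email.utils import formatdate

def cache_headers(body, last_modified, now, lifetime_seconds):
    """Build validator and expiry headers for a response body (bytes).

    last_modified and now are Unix timestamps in seconds.
    """
    return {
        # Validator derived from the content itself.
        "ETag": '"' + hashlib.sha1(body).hexdigest() + '"',
        "Last-Modified": formatdate(last_modified, usegmt=True),
        "Expires": formatdate(now + lifetime_seconds, usegmt=True),
    }

# Timestamps of 0 keep the example output fixed; real code would use time.time().
headers = cache_headers(b"<html>news</html>", last_modified=0, now=0,
                        lifetime_seconds=15 * 60)
print(headers["Expires"])  # Thu, 01 Jan 1970 00:15:00 GMT
```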

    Is Apache faster than Resin Standalone?

    For small files, Resin is about 10-20% faster. For large files (1M), they're essentially identical. (It's possible that the very latest Apache has improved performance.)

    For JSP and Servlets, Resin standalone is certainly faster than Resin/Apache. Because of the extra overhead of the Resin/Apache connection, the Resin/Apache configuration is necessarily slower than Resin standalone.

    It's only static files where Apache could be faster. Well, there's an exception for SSL. It's conceivable that Apache/Resin with SSL would be faster than Resin with SSL.

    What is the performance loss with a Servlet or JSP compared to a static file?

    With Resin standalone, JSP files are essentially as fast as static files (as long as you don't actually do any processing :-).

    If Resin is behind another web server, like IIS or Apache, there is a performance decrease with JSP and Servlet files, which comes from the overhead needed for the communication between the other web server and Resin.

    Caching

    What gets cached when a servlet does a forward?

    The only thing that matters is the HTTP headers. So if you telnet to the server, you should be able to see whether the headers are properly set or not.

    In the case of a forward, you should be able to just set the headers without needing to modify the JSP itself.

    One thing to be aware of: the caching is based on the original URL. So if your forwarding servlet varies its output based on some request headers (like User-Agent), it needs to set the Vary header.

    <cache-mapping> is a related but somewhat separate issue, and I think we haven't explained it properly.

    <cache-mapping> only works on cacheable responses which have not set the Expires header. If you're missing the Expires header, <cache-mapping> will set it for you.

    Cacheable means:

    1. either ETag or Last-Modified must be set in the response (ETag is better). The servlet will normally set that value.
    2. no cache-control is set in the response headers
    3. no Vary tag is set (Resin doesn't completely implement Vary.)

    So your servlet still needs to do some work. <cache-mapping> isn't all that you need. The reason that <cache-mapping> works with normal files is that Resin's FileServlet sets the Last-Modified and ETag headers, but does not set the Expires header.
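
    The three conditions above, and the extra condition for <cache-mapping>, can be written as a small check over a response's headers. This is only a sketch of the rules as stated in this section, not Resin's actual code. Both function names are hypothetical and the headers are modelled as a plain dictionary.

```python
def is_cacheable(headers):
    """Apply the three listed conditions to a dict of response headers."""
    names = {name.lower() for name in headers}
    has_validator = "etag" in names or "last-modified" in names
    return has_validator and "cache-control" not in names and "vary" not in names

def cache_mapping_applies(headers):
    """<cache-mapping> supplies Expires only when the response lacks one."""
    names = {name.lower() for name in headers}
    return is_cacheable(headers) and "expires" not in names

# A static file as FileServlet would send it: validators but no Expires.
static_file = {"Last-Modified": "Thu, 01 Jan 2009 00:00:00 GMT", "ETag": '"abc"'}
print(is_cacheable(static_file), cache_mapping_applies(static_file))  # True True
print(is_cacheable({"ETag": '"abc"', "Vary": "User-Agent"}))          # False
```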

    What if while the cache is being filled, another request comes?

    Resin 'fills a cache' the first time a request comes in. If another request comes in and Resin has not finished filling the cache, the second request will be treated as uncachable. This means that until the cache is filled, requests will miss the cache and get serviced directly.

    This is also what happens when the cache expires. The first request to come in after the expiry time invalidates it, and while it is being filled the other requests pass through to the resource being cached.
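
    The behaviour described above can be modelled as a three-state entry. The class below is a single-threaded toy model with hypothetical names, meant only to show which requests hit, fill or bypass the cache. It is not how Resin implements its proxy cache, and it does no locking.

```python
class FillOnceCache:
    """Single-entry cache where only the first requester fills the entry."""

    def __init__(self):
        self.state = "empty"   # "empty", "filling", or "filled"
        self.value = None

    def begin_request(self):
        """Return how an arriving request is handled."""
        if self.state == "filled":
            return "hit"
        if self.state == "filling":
            return "pass-through"   # treated as uncacheable
        self.state = "filling"
        return "fill"

    def finish_fill(self, value):
        self.value = value
        self.state = "filled"

    def expire(self):
        """Expiry invalidates the entry, so the next request fills it again."""
        self.value = None
        self.state = "empty"

cache = FillOnceCache()
print(cache.begin_request())  # fill
print(cache.begin_request())  # pass-through
cache.finish_fill("page")
print(cache.begin_request())  # hit
```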

    This behaviour may change in Resin 3.0.

    Resin Threads

    Resin will automatically allocate and free threads as the load requires. Since the threads are pooled, Resin can reuse old threads without the performance penalty of creating and destroying the threads. When the load drops, Resin will slowly decrease the number of threads in the pool until it matches the load.

    Most users can set thread-max to something large (200 or greater) and then forget about the threading. Some ISPs dedicate a JVM per user and have many JVMs on the same machine. In that case, it may make sense to reduce the thread-max to throttle the requests.

    Since each servlet request gets its own thread, thread-max determines the maximum number of concurrent users. So if you have a peak of 100 users with slow modems downloading a large file, you'll need a thread-max of at least 100. The number of concurrent users is unrelated to the number of active sessions. Unless the user is actively downloading, he doesn't need a thread (except for "keepalives").

    Keepalives

    Keepalives make HTTP and srun requests more efficient. Connecting to a TCP server is relatively expensive. The client and server need to send several packets back and forth to establish the connection before the first data can go through. HTTP/1.1 introduced a protocol to keep the connection open for more requests. The srun protocol between Resin and the web server plugin also uses keepalives. By keeping the connection open for following requests, Resin can improve performance.

    resin.xml for thread-keepalive
    <resin xmlns="http://caucho.com/ns/resin">
      <cluster id="app-tier">
    
        <server-default>
          <thread-max>250</thread-max>
    
          <keepalive-max>500</keepalive-max>
          <keepalive-timeout>120s</keepalive-timeout>
        </server-default>
        ...
      </cluster>
    </resin>
    

    Timeouts

    Requests and keepalive connections can only be idle for a limited time before Resin closes them. Each connection has a read timeout, socket-timeout. If the client doesn't send a request within the timeout, Resin will close the TCP socket. The timeout prevents idle clients from hogging Resin resources.

    <resin xmlns="http://caucho.com/ns/resin">
      <cluster id="app-tier">
    
        <server-default>
          <thread-max>250</thread-max>
    
          <socket-timeout>30s</socket-timeout>
    
          <http port="8080"/>
        </server-default>
    
        ...
      </cluster>
    </resin>
    
    <resin xmlns="http://caucho.com/ns/resin">
      <cluster id="app-tier">
    
        <server-default>
          <thread-max>250</thread-max>
    
          <load-balance-idle-time>20s</load-balance-idle-time>
    
          <socket-timeout>30s</socket-timeout>
    
          <http port="8080"/>
        </server-default>
    
        <server id="app-a" address="192.168.2.1" port="6802"/>
    
        ...
      </cluster>
    </resin>
    

    In general, the socket-timeout and keepalives are less important for Resin standalone configurations than Apache/IIS/srun configurations. Very heavy traffic sites may want to reduce the timeout for Resin standalone.

    Since socket-timeout will close srun connections, its setting needs to take into consideration the load-balance-idle-time setting for mod_caucho or isapi_srun. load-balance-idle-time is the time the plugin will keep a connection open. socket-timeout must always be larger than load-balance-idle-time, otherwise the plugin will try to reuse a closed socket.

    Plugin keepalives (mod_caucho/isapi_srun)

    The web server plugin, mod_caucho, needs configuration for its keepalive handling because requests are handled differently in the web server. Until the web server sends a request to Resin, it can't tell if Resin has closed the other end of the socket. If the JVM has restarted or if Resin has closed the socket because of socket-timeout, mod_caucho will not know about the closed socket. So mod_caucho needs to know how long to consider a connection reusable before closing it. load-balance-idle-time tells the plugin how long it should consider a socket usable.

    Because the plugin isn't signalled when Resin closes the socket, the socket will remain half-closed until the next web server request. A netstat will show that as a bunch of sockets in the FIN_WAIT_2 state. With Apache, there doesn't appear to be a good way around this. If these become a problem, you can increase socket-timeout and load-balance-idle-time so the JVM won't close the keepalive connections as fast.

    unix> netstat
    ...
    localhost.32823      localhost.6802       32768      0 32768      0 CLOSE_WAIT
    localhost.6802       localhost.32823      32768      0 32768      0 FIN_WAIT_2
    localhost.32824      localhost.6802       32768      0 32768      0 CLOSE_WAIT
    localhost.6802       localhost.32824      32768      0 32768      0 FIN_WAIT_2
    ...
    

    TCP limits (TIME_WAIT)

    A client and a server that open a large number of TCP connections can run into operating system/TCP limits. If mod_caucho isn't configured properly, it can use too many connections to Resin. When the limit is reached, mod_caucho will report "can't connect" errors until a timeout is reached. Load testing or benchmarking can run into the same limits, causing apparent connection failures even though the Resin process is running fine.

    The TCP limit is the TIME_WAIT timeout. When the TCP socket closes, the side starting the close puts the socket into the TIME_WAIT state. A netstat will show the sockets in the TIME_WAIT state. The following shows an example of the TIME_WAIT sockets generated while benchmarking. Each client connection has a unique ephemeral port and the server always uses its public port:

    Typical Benchmarking Netstat
    unix> netstat
    ...
    tcp   0   0 localhost:25033  localhost:8080  TIME_WAIT   
    tcp   0   0 localhost:25032  localhost:8080  TIME_WAIT   
    tcp   0   0 localhost:25031  localhost:8080  TIME_WAIT   
    tcp   0   0 localhost:25030  localhost:8080  TIME_WAIT   
    tcp   0   0 localhost:25029  localhost:8080  TIME_WAIT   
    tcp   0   0 localhost:25028  localhost:8080  TIME_WAIT
    ...
    

    The socket will remain in the TIME_WAIT state for a system-dependent time, generally 120 seconds, but usually configurable. Since there are fewer than 32k ephemeral sockets available to the client, the client will eventually run out and start seeing connection failures. On some operating systems, including RedHat Linux, the default limit is only 4k sockets. Even the full 32k sockets with a 120 second timeout limits the rate to about 250 connections per second.
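
    The limit follows from dividing the available ephemeral ports by the TIME_WAIT duration. The sketch below does that division. The function name is hypothetical, and the port counts are round upper bounds (exactly 32k and 4k), so the first result comes out slightly above the "about 250" quoted above.

```python
def max_connection_rate(ephemeral_ports, time_wait_seconds):
    """Sustained new connections per second before ports run out."""
    return ephemeral_ports / time_wait_seconds

print(int(max_connection_rate(32 * 1024, 120)))  # 273
print(int(max_connection_rate(4 * 1024, 120)))   # 34
```

    With only 4k ports, a benchmark that opens a new connection for every request would start failing at a few dozen requests per second, which is why keepalives matter for load testing.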

    If mod_caucho or isapi_srun is misconfigured, it can use too many connections and run into the TIME_WAIT limits. Using keepalives effectively avoids this problem. Since keepalive connections are reused, they won't go into the TIME_WAIT state until they're finally closed. A site can maximize the keepalives by setting keepalive-max large and setting load-balance-idle-time and socket-timeout to large values. keepalive-max limits the maximum number of keepalive connections. load-balance-idle-time and socket-timeout configure how long the connection will be reused.

    Configuration for a medium-loaded Apache
    <resin xmlns="http://caucho.com/ns/resin">
      <cluster id="app-tier">
    
        <server-default>
          <thread-max>250</thread-max>
    
          <keepalive-max>250</keepalive-max>
          <keepalive-timeout>120s</keepalive-timeout>
    
          <load-balance-idle-time>100s</load-balance-idle-time>
        </server-default>
    
        <server id="app-a" address="192.168.2.1"/>
    
        ...
      </cluster>
    </resin>
    

    socket-timeout must always be larger than load-balance-idle-time. In addition, keepalive-max should be larger than the maximum number of Apache processes.
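
    The two rules above lend themselves to a quick sanity check before deploying a configuration. The following is a minimal sketch, assuming the values have already been read from resin.xml and the Apache configuration and converted to plain numbers (timeouts in seconds). check_keepalive_settings is a hypothetical helper, not a Resin tool.

```python
def check_keepalive_settings(socket_timeout, load_balance_idle_time,
                             keepalive_max, apache_max_processes):
    """Return a list of problems; an empty list means both rules hold."""
    problems = []
    if socket_timeout <= load_balance_idle_time:
        problems.append(
            "socket-timeout must be larger than load-balance-idle-time")
    if keepalive_max <= apache_max_processes:
        problems.append(
            "keepalive-max should be larger than the number of Apache processes")
    return problems

# Hypothetical values: 120s socket-timeout, 100s idle time, 150 Apache processes.
print(check_keepalive_settings(120, 100, 250, 150))       # []
print(len(check_keepalive_settings(30, 100, 250, 256)))   # 2
```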


    Copyright © 1998-2009 Caucho Technology, Inc. All rights reserved.
    Resin® is a registered trademark, and Quercus™, Amber™, and Hessian™ are trademarks of Caucho Technology.