From 22c71e27cef8131c8432b57d5965bd14e5300428 Mon Sep 17 00:00:00 2001 From: Sandy Walsh Date: Tue, 7 Jun 2011 15:36:43 -0300 Subject: Added illustrations for Distributed Scheduler and fixed up formatting --- doc/source/devref/distributed_scheduler.rst | 90 ++++++++++++--------- doc/source/images/costs_weights.png | Bin 0 -> 35723 bytes doc/source/images/dating_service.png | Bin 0 -> 31945 bytes doc/source/images/filtering.png | Bin 0 -> 18779 bytes doc/source/images/nova.compute.api.create.png | Bin 0 -> 50171 bytes .../images/nova.compute.api.create_all_at_once.png | Bin 0 -> 62263 bytes doc/source/images/zone_aware_overview.png | Bin 0 -> 56142 bytes doc/source/images/zone_aware_scheduler.png | Bin 0 -> 20902 bytes 8 files changed, 53 insertions(+), 37 deletions(-) create mode 100644 doc/source/images/costs_weights.png create mode 100644 doc/source/images/dating_service.png create mode 100644 doc/source/images/filtering.png create mode 100755 doc/source/images/nova.compute.api.create.png create mode 100755 doc/source/images/nova.compute.api.create_all_at_once.png create mode 100755 doc/source/images/zone_aware_overview.png create mode 100644 doc/source/images/zone_aware_scheduler.png (limited to 'doc/source') diff --git a/doc/source/devref/distributed_scheduler.rst b/doc/source/devref/distributed_scheduler.rst index eb6a1a03e..cc9e78916 100644 --- a/doc/source/devref/distributed_scheduler.rst +++ b/doc/source/devref/distributed_scheduler.rst @@ -15,10 +15,12 @@ under the License. Distributed Scheduler -===== +===================== The Scheduler is akin to a Dating Service. Requests for the creation of new instances come in and the most applicable Compute nodes are selected from a large pool of potential candidates. In a small deployment we may be happy with the currently available Change Scheduler which randomly selects a Host from the available pool. Or if you need something a little more fancy you may want to use the Availability Zone Scheduler, which selects Compute hosts from a logical partitioning of available hosts (within a single Zone). + .. image:: /images/dating_service.png + But for larger deployments a more complex scheduling algorithm is required. Additionally, if you are using Zones in your Nova setup, you'll need a scheduler that understand how to pass instance requests from Zone to Zone. This is the purpose of the Distributed Scheduler (DS). The DS utilizes the Capabilities of a Zone and its component services to make informed decisions on where a new instance should be created. When making this decision it consults not only all the Compute nodes in the current Zone, but the Compute nodes in each Child Zone. This continues recursively until the ideal host is found. @@ -27,70 +29,82 @@ So, how does this all work? This document will explain the strategy employed by the `ZoneAwareScheduler` and its derivations. You should read the Zones documentation before reading this. + .. image:: /images/zone_aware_scheduler.png + Costs & Weights ----------- +--------------- When deciding where to place an Instance, we compare a Weighted Cost for each Host. The Weighting, currently, is just the sum of each Cost. Costs are nothing more than integers from `0 - max_int`. Costs are computed by looking at the various Capabilities of the Host relative to the specs of the Instance being asked for. Trying to putting a plain vanilla instance on a high performance host should have a very high cost. But putting a vanilla instance on a vanilla Host should have a low cost. 
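As a rough illustration of the sum-of-costs idea, the sketch below weighs two hypothetical hosts against an instance spec. The host dictionaries, cost functions and field names are invented for this example and are not the actual Nova scheduler code; it only shows individual integer Costs being added into a single Weight per host, with a lower Weight being better::

    def premium_hardware_cost(host, instance_spec):
        # Penalize putting a plain vanilla instance on high performance hardware.
        if host['class'] == 'premium' and instance_spec['flavor'] == 'vanilla':
            return 1000
        return 0

    def fill_first_cost(host, instance_spec):
        # Prefer hosts with less free RAM so instances pack together.
        return host['free_ram_mb']

    cost_functions = [premium_hardware_cost, fill_first_cost]

    def weigh_host(host, instance_spec):
        # The Weight is currently just the sum of the individual Costs.
        return sum(fn(host, instance_spec) for fn in cost_functions)

    hosts = [
        {'name': 'host1', 'class': 'vanilla', 'free_ram_mb': 2048},
        {'name': 'host2', 'class': 'premium', 'free_ram_mb': 8192},
    ]
    spec = {'flavor': 'vanilla', 'ram_mb': 512}
    print(sorted((weigh_host(h, spec), h['name']) for h in hosts))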
Some Costs are more esoteric. Consider a rule that says we should prefer Hosts that don't already have an instance on it that is owned by the user requesting it (to mitigate against machine failures). Here we have to look at all the other Instances on the host to compute our cost. An example of some other costs might include selecting: -* a GPU-based host over a standard CPU -* a host with fast ethernet over a 10mbps line -* a host that can run Windows instances -* a host in the EU vs North America -* etc + * a GPU-based host over a standard CPU + * a host with fast ethernet over a 10mbps line + * a host that can run Windows instances + * a host in the EU vs North America + * etc This Weight is computed for each Instance requested. If the customer asked for 1000 instances, the consumed resources on each Host are "virtually" depleted so the Cost can change accordingly. + .. image:: /images/costs_weights.png + nova.scheduler.zone_aware_scheduler.ZoneAwareScheduler ------------ +------------------------------------------------------ As we explained in the Zones documentation, each Scheduler has a `ZoneManager` object that collects "Capabilities" about child Zones and each of the services running in the current Zone. The `ZoneAwareScheduler` uses this information to make its decisions. Here is how it works: -1. The compute nodes are filtered and the nodes remaining are weighed. -1a. Filtering the hosts is a simple matter of ensuring the compute node has ample resources (CPU, RAM, Disk, etc) to fulfil the request. -1b. Weighing of the remaining compute nodes assigns a number based on their suitability for the request. -2. The same request is sent to each child Zone and step #1 is done there too. The resulting weighted list is returned to the parent. -3. The parent Zone sorts and aggregates all the weights and a final build plan is constructed. -4. The build plan is executed upon. Concurrently, instance create requests are sent to each of the selected hosts, be they local or in a child zone. Child Zones may forward the requests to their child Zones as needed. + 1. The compute nodes are filtered and the nodes remaining are weighed. + 2. Filtering the hosts is a simple matter of ensuring the compute node has ample resources (CPU, RAM, Disk, etc) to fulfil the request. + 3. Weighing of the remaining compute nodes assigns a number based on their suitability for the request. + 4. The same request is sent to each child Zone and step #1 is done there too. The resulting weighted list is returned to the parent. + 5. The parent Zone sorts and aggregates all the weights and a final build plan is constructed. + 6. The build plan is executed upon. Concurrently, instance create requests are sent to each of the selected hosts, be they local or in a child zone. Child Zones may forward the requests to their child Zones as needed. + + .. image:: /images/zone_aware_overview.png `ZoneAwareScheduler` by itself is not capable of handling all the provisioning itself. Derived classes are used to select which host filtering and weighing strategy will be used. Filtering and Weighing ------------- +---------------------- The filtering (excluding compute nodes incapable of fulfilling the request) and weighing (computing the relative "fitness" of a compute node to fulfill the request) rules used are very subjective operations ... Service Providers will probably have a very different set of filtering and weighing rules than private cloud administrators. 
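To make the filter-then-weigh split concrete, here is a deliberately simplified sketch. The capability fields, request fields and function names are assumptions made for this illustration rather than the real `ZoneAwareScheduler` interfaces; it only shows hosts being dropped for lack of resources and the survivors being ranked::

    def filter_hosts(hosts, request):
        # Keep only hosts with ample CPU, RAM and disk for the request.
        return [h for h in hosts
                if h['vcpus_free'] >= request['vcpus']
                and h['free_ram_mb'] >= request['ram_mb']
                and h['free_disk_gb'] >= request['disk_gb']]

    def weigh_hosts(hosts):
        # Give each surviving host a fitness number; lower is better here.
        return sorted(({'hostname': h['name'], 'weight': h['free_ram_mb']}
                       for h in hosts), key=lambda d: d['weight'])

    hosts = [
        {'name': 'alpha', 'vcpus_free': 8, 'free_ram_mb': 16384, 'free_disk_gb': 200},
        {'name': 'beta', 'vcpus_free': 1, 'free_ram_mb': 1024, 'free_disk_gb': 10},
    ]
    request = {'vcpus': 2, 'ram_mb': 4096, 'disk_gb': 40}
    print(weigh_hosts(filter_hosts(hosts, request)))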
The filtering and weighing aspects of the `ZoneAwareScheduler` are flexible and extensible. + .. image:: /images/filtering.png + Requesting a new instance ------------- +------------------------- Prior to the `ZoneAwareScheduler`, to request a new instance, a call was made to `nova.compute.api.create()`. The type of instance created depended on the value of the `InstanceType` record being passed in. The `InstanceType` determined the amount of disk, CPU, RAM and network required for the instance. Administrators can add new `InstanceType` records to suit their needs. For more complicated instance requests we need to go beyond the default fields in the `InstanceType` table. `nova.compute.api.create()` performed the following actions: -1. it validated all the fields passed into it. -2. it created an entry in the `Instance` table for each instance requested -3. it put one `run_instance` message in the scheduler queue for each instance requested -4. the schedulers picked off the messages and decided which compute node should handle the request. -5. the `run_instance` message was forwarded to the compute node for processing and the instance is created. -6. it returned a list of dicts representing each of the `Instance` records (even if the instance has not been activated yet). At least the `instance_id`s are valid. + 1. it validated all the fields passed into it. + 2. it created an entry in the `Instance` table for each instance requested + 3. it put one `run_instance` message in the scheduler queue for each instance requested + 4. the schedulers picked off the messages and decided which compute node should handle the request. + 5. the `run_instance` message was forwarded to the compute node for processing and the instance is created. + 6. it returned a list of dicts representing each of the `Instance` records (even if the instance has not been activated yet). At least the `instance_ids` are valid. + + .. image:: /images/nova.compute.api.create.png Generally, the standard schedulers (like `ChanceScheduler` and `AvailabilityZoneScheduler`) only operate in the current Zone. They have no concept of child Zones. The problem with this approach is that each request is scattered amongst each of the schedulers. If we are asking for 1000 instances, each scheduler gets the requests one-at-a-time. There is no possibility of optimizing the requests to take into account all 1000 instances as a group. We call this Single-Shot vs. All-at-Once. For the `ZoneAwareScheduler` we need to use the All-at-Once approach. We need to consider all the hosts across all the Zones before deciding where they should reside. In order to handle this we have a new method `nova.compute.api.create_all_at_once()`. This method does things a little differently: -1. it validates all the fields passed into it. -2. it creates a single `reservation_id` for all of instances created. This is a UUID. -3. it creates a single `run_instance` request in the scheduler queue -4. a scheduler picks the message off the queue and works on it. -5. the scheduler sends off an OS API `POST /zones/select` command to each child Zone. The `BODY` payload of the call contains the `request_spec`. -6. the child Zones use the `request_spec` to compute a weighted list for each instance requested. No attempt to actually create an instance is done at this point. We're only estimating the suitability of the Zones. -7. if the child Zone has its own child Zones, the `/zones/select` call will be sent down to them as well. -8.
Finally, when all the estimates have bubbled back to the Zone that initiated the call, all the results are merged, sorted and processed. -9. Now the instances can be created. The initiating Zone either forwards the `run_instance` message to the local Compute node to do the work, or it issues a `POST /servers` call to the relevant child Zone. The parameters to the child Zone call are the same as what was passed in by the user. -10. The `reservation_id` is passed back to the caller. Later we explain how the user can check on the status of the command with this `reservation_id`. + 1. it validates all the fields passed into it. + 2. it creates a single `reservation_id` for all of instances created. This is a UUID. + 3. it creates a single `run_instance` request in the scheduler queue + 4. a scheduler picks the message off the queue and works on it. + 5. the scheduler sends off an OS API `POST /zones/select` command to each child Zone. The `BODY` payload of the call contains the `request_spec`. + 6. the child Zones use the `request_spec` to compute a weighted list for each instance requested. No attempt to actually create an instance is done at this point. We're only estimating the suitability of the Zones. + 7. if the child Zone has its own child Zones, the `/zones/select` call will be sent down to them as well. + 8. Finally, when all the estimates have bubbled back to the Zone that initiated the call, all the results are merged, sorted and processed. + 9. Now the instances can be created. The initiating Zone either forwards the `run_instance` message to the local Compute node to do the work, or it issues a `POST /servers` call to the relevant child Zone. The parameters to the child Zone call are the same as what was passed in by the user. + 10. The `reservation_id` is passed back to the caller. Later we explain how the user can check on the status of the command with this `reservation_id`. + + .. image:: /images/nova.compute.api.create_all_at_once.png The Catch -------------- +--------- This all seems pretty straightforward but, like most things, there's a catch. Zones are expected to operate in complete isolation from each other. Each Zone has its own AMQP service, database and set of Nova services. But, for security reasons Zones should never leak information about the architectural layout internally. That means Zones cannot leak information about hostnames or service IP addresses outside of its world. When `POST /zones/select` is called to estimate which compute node to use, time passes until the `POST /servers` call is issued. If we only passed the weight back from the `select` we would have to re-compute the appropriate compute node for the create command ... and we could end up with a different host. Somehow we need to remember the results of our computations and pass them outside of the Zone. Now, we could store this information in the local database and return a reference to it, but remember that the vast majority of weights are going be ignored. Storing them in the database would result in a flood of disk access and then we have to clean up all these entries periodically. Recall that there are going to be many many `select` calls issued to child Zones asking for estimates. @@ -117,7 +131,7 @@ Finally, we need to give the user a way to get information on each of the instan `python-novaclient` will be extended to support both of these changes. Host Filter --------------- +----------- As we mentioned earlier, filtering hosts is a very deployment-specific process. 
Service Providers may have a different set of criteria for filtering Compute nodes than a University. To facilitate this, the `nova.scheduler.host_filter` module supports a variety of filtering strategies as well as an easy means for plugging in your own algorithms. @@ -130,21 +144,22 @@ The filter used is determined by the `--default_host_filter` flag, which points To create your own `HostFilter` the user simply has to derive from `nova.scheduler.host_filter.HostFilter` and implement two methods: `instance_type_to_filter` and `filter_hosts`. Since Nova is currently dependent on the `InstanceType` structure, the `instance_type_to_filter` method should take an `InstanceType` and turn it into an internal data structure usable by your filter. This is for backward compatibility with existing OpenStack and EC2 API calls. If you decide to create your own call for creating instances not based on `Flavors` or `InstanceTypes` you can ignore this method. The real work is done in `filter_hosts` which must return a list of host tuples for each appropriate host. The set of all available hosts is in the `ZoneManager` object passed into the call as well as the filter query. The host tuple contains (``, ``) where `` is whatever you want it to be. Cost Scheduler Weighing --------------- +----------------------- Every `ZoneAwareScheduler` derivation must also override the `weigh_hosts` method. This takes the list of filtered hosts (generated by the `filter_hosts` method) and returns a list of weight dicts. The weight dicts must contain two keys: `weight` and `hostname` where `weight` is simply an integer (lower is better) and `hostname` is the name of the host. The list does not need to be sorted; this will be done by the `ZoneAwareScheduler` base class when all the results have been assembled. Simple Zone Aware Scheduling --------------- +---------------------------- The easiest way to get started with the `ZoneAwareScheduler` is to use the `nova.scheduler.host_filter.HostFilterScheduler`. This scheduler uses the default Host Filter as and the `weight_hosts` method simply returns a weight of 1 for all hosts. But, from this, you can see calls being routed from Zone to Zone and follow the flow of things. The `--scheduler_driver` flag is how you specify the scheduler class name. Flags --------------- +----- All this Zone and Distributed Scheduler stuff can seem a little daunting to configure, but it's actually not too bad.
Here are some of the main flags you should set in your `nova.conf` file: :: + --allow_admin_api=true --enable_zone_routing=true --zone_name=zone1 @@ -162,6 +177,7 @@ All this Zone and Distributed Scheduler stuff can seem a little daunting to conf Some optional flags which are handy for debugging are: :: + --connection_type=fake --verbose diff --git a/doc/source/images/costs_weights.png b/doc/source/images/costs_weights.png new file mode 100644 index 000000000..b65e98b0c Binary files /dev/null and b/doc/source/images/costs_weights.png differ diff --git a/doc/source/images/dating_service.png b/doc/source/images/dating_service.png new file mode 100644 index 000000000..49f1bd86a Binary files /dev/null and b/doc/source/images/dating_service.png differ diff --git a/doc/source/images/filtering.png b/doc/source/images/filtering.png new file mode 100644 index 000000000..4303bded8 Binary files /dev/null and b/doc/source/images/filtering.png differ diff --git a/doc/source/images/nova.compute.api.create.png b/doc/source/images/nova.compute.api.create.png new file mode 100755 index 000000000..999f39ed9 Binary files /dev/null and b/doc/source/images/nova.compute.api.create.png differ diff --git a/doc/source/images/nova.compute.api.create_all_at_once.png b/doc/source/images/nova.compute.api.create_all_at_once.png new file mode 100755 index 000000000..c3ce86d03 Binary files /dev/null and b/doc/source/images/nova.compute.api.create_all_at_once.png differ diff --git a/doc/source/images/zone_aware_overview.png b/doc/source/images/zone_aware_overview.png new file mode 100755 index 000000000..470e78138 Binary files /dev/null and b/doc/source/images/zone_aware_overview.png differ diff --git a/doc/source/images/zone_aware_scheduler.png b/doc/source/images/zone_aware_scheduler.png new file mode 100644 index 000000000..a144e1212 Binary files /dev/null and b/doc/source/images/zone_aware_scheduler.png differ -- cgit From c2ed9160e9aba986e98a32514cb27ab34be9bf0c Mon Sep 17 00:00:00 2001 From: Sandy Walsh Date: Fri, 10 Jun 2011 09:48:17 -0300 Subject: source illustrations added & spelling/grammar based on comstud's feedback --- doc/source/devref/distributed_scheduler.rst | 16 ++++++++++------ doc/source/devref/zone.rst | 4 ++-- doc/source/image_src/zones_distsched_illustrations.odp | Bin 0 -> 182810 bytes 3 files changed, 12 insertions(+), 8 deletions(-) create mode 100755 doc/source/image_src/zones_distsched_illustrations.odp (limited to 'doc/source') diff --git a/doc/source/devref/distributed_scheduler.rst b/doc/source/devref/distributed_scheduler.rst index cc9e78916..e33fda4d2 100644 --- a/doc/source/devref/distributed_scheduler.rst +++ b/doc/source/devref/distributed_scheduler.rst @@ -14,10 +14,14 @@ License for the specific language governing permissions and limitations under the License. + Source for illustrations in doc/source/image_src/zone_distsched_illustrations.odp + (OpenOffice Impress format) Illustrations are "exported" to png and then scaled + to 400x300 or 640x480 as needed and placed in the doc/source/images directory. + Distributed Scheduler ===================== -The Scheduler is akin to a Dating Service. Requests for the creation of new instances come in and the most applicable Compute nodes are selected from a large pool of potential candidates. In a small deployment we may be happy with the currently available Change Scheduler which randomly selects a Host from the available pool. 
Or if you need something a little more fancy you may want to use the Availability Zone Scheduler, which selects Compute hosts from a logical partitioning of available hosts (within a single Zone). +The Scheduler is akin to a Dating Service. Requests for the creation of new instances come in and the most applicable Compute nodes are selected from a large pool of potential candidates. In a small deployment we may be happy with the currently available Chance Scheduler which randomly selects a Host from the available pool. Or if you need something a little more fancy you may want to use the Availability Zone Scheduler, which selects Compute hosts from a logical partitioning of available hosts (within a single Zone). .. image:: /images/dating_service.png @@ -27,13 +31,13 @@ This is the purpose of the Distributed Scheduler (DS). The DS utilizes the Capab So, how does this all work? -This document will explain the strategy employed by the `ZoneAwareScheduler` and its derivations. You should read the Zones documentation before reading this. +This document will explain the strategy employed by the `ZoneAwareScheduler` and its derivations. You should read the :doc:`devguide/zones` documentation before reading this. .. image:: /images/zone_aware_scheduler.png Costs & Weights --------------- -When deciding where to place an Instance, we compare a Weighted Cost for each Host. The Weighting, currently, is just the sum of each Cost. Costs are nothing more than integers from `0 - max_int`. Costs are computed by looking at the various Capabilities of the Host relative to the specs of the Instance being asked for. Trying to putting a plain vanilla instance on a high performance host should have a very high cost. But putting a vanilla instance on a vanilla Host should have a low cost. +When deciding where to place an Instance, we compare a Weighted Cost for each Host. The Weighting, currently, is just the sum of each Cost. Costs are nothing more than integers from `0 - max_int`. Costs are computed by looking at the various Capabilities of the Host relative to the specs of the Instance being asked for. Trying to put a plain vanilla instance on a high performance host should have a very high cost. But putting a vanilla instance on a vanilla Host should have a low cost. Some Costs are more esoteric. Consider a rule that says we should prefer Hosts that don't already have an instance on it that is owned by the user requesting it (to mitigate against machine failures). Here we have to look at all the other Instances on the host to compute our cost. @@ -107,7 +111,7 @@ The Catch --------- This all seems pretty straightforward but, like most things, there's a catch. Zones are expected to operate in complete isolation from each other. Each Zone has its own AMQP service, database and set of Nova services. But, for security reasons Zones should never leak information about the architectural layout internally. That means Zones cannot leak information about hostnames or service IP addresses outside of its world. -When `POST /zones/select` is called to estimate which compute node to use, time passes until the `POST /servers` call is issued. If we only passed the weight back from the `select` we would have to re-compute the appropriate compute node for the create command ... and we could end up with a different host. Somehow we need to remember the results of our computations and pass them outside of the Zone. 
Now, we could store this information in the local database and return a reference to it, but remember that the vast majority of weights are going be ignored. Storing them in the database would result in a flood of disk access and then we have to clean up all these entries periodically. Recall that there are going to be many many `select` calls issued to child Zones asking for estimates. +When `POST /zones/select` is called to estimate which compute node to use, time passes until the `POST /servers` call is issued. If we only passed the weight back from the `select` we would have to re-compute the appropriate compute node for the create command ... and we could end up with a different host. Somehow we need to remember the results of our computations and pass them outside of the Zone. Now, we could store this information in the local database and return a reference to it, but remember that the vast majority of weights are going to be ignored. Storing them in the database would result in a flood of disk access and then we have to clean up all these entries periodically. Recall that there are going to be many many `select` calls issued to child Zones asking for estimates. Instead, we take a rather innovative approach to the problem. We encrypt all the child zone internal details and pass them back the to parent Zone. If the parent zone decides to use a child Zone for the instance it simply passes the encrypted data back to the child during the `POST /servers` call as an extra parameter. The child Zone can then decrypt the hint and go directly to the Compute node previously selected. If the estimate isn't used, it is simply discarded by the parent. It's for this reason that it is so important that each Zone defines a unique encryption key via `--build_plan_encryption_key` @@ -122,7 +126,7 @@ NOTE: The features described in this section are related to the up-coming 'merge The OpenStack API allows a user to list all the instances they own via the `GET /servers/` command or the details on a particular instance via `GET /servers/###`. This mechanism is usually sufficient since OS API only allows for creating one instance at a time, unlike the EC2 API which allows you to specify a quantity of instances to be created. -NOTE: currently the `GET /servers` command is not Zone-aware since all operations done in child Zones are done via a single administrative account. Therefore, asking a child Zone to `GET /servers` would return all the active instances ... and that would be what the user intended. Later, when the Keystone Auth system is integrated with Nova, this functionality will be enabled. +NOTE: currently the `GET /servers` command is not Zone-aware since all operations done in child Zones are done via a single administrative account. Therefore, asking a child Zone to `GET /servers` would return all the active instances ... and that would not be what the user intended. Later, when the Keystone Auth system is integrated with Nova, this functionality will be enabled. We could use the OS API 1.1 Extensions mechanism to accept a `num_instances` parameter, but this would result in a different return code. Instead of getting back an `Instance` record, we would be getting back a `reservation_id`. So, instead, we've implemented a new command `POST /zones/boot` command which is nearly identical to `POST /servers` except that it takes a `num_instances` parameter and returns a `reservation_id`. Perhaps in OS API 2.x we can unify these approaches. 
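To tie the pieces of "The Catch" together, here is a toy, self-contained sketch of the select-then-create round trip. Base64 stands in for the real encryption keyed by `--build_plan_encryption_key`, and the field names and helper functions are invented for illustration; the point is only that the parent treats the child's answer as an opaque blob and hands it straight back at create time::

    import base64
    import json

    def child_zone_select(hosts):
        # Child Zone: weigh its hosts and return (weight, opaque blob) pairs.
        estimates = []
        for host in hosts:
            weight = host['free_ram_mb']  # stand-in cost; lower is better
            hint = json.dumps({'hostname': host['name'], 'weight': weight})
            blob = base64.b64encode(hint.encode()).decode()  # NOT real crypto
            estimates.append({'weight': weight, 'blob': blob})
        return estimates

    def child_zone_create(blob):
        # Child Zone: decode the hint it produced earlier and go straight to
        # the host it picked, with no second round of scheduling.
        hint = json.loads(base64.b64decode(blob))
        return 'building instance on %s' % hint['hostname']

    # Parent Zone: gather estimates, pick the best, pass the blob back untouched.
    estimates = child_zone_select([{'name': 'c1', 'free_ram_mb': 512},
                                   {'name': 'c2', 'free_ram_mb': 4096}])
    best = min(estimates, key=lambda e: e['weight'])
    print(child_zone_create(best['blob']))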
@@ -149,7 +153,7 @@ Every `ZoneAwareScheduler` derivation must also override the `weigh_hosts` metho Simple Zone Aware Scheduling ---------------------------- -The easiest way to get started with the `ZoneAwareScheduler` is to use the `nova.scheduler.host_filter.HostFilterScheduler`. This scheduler uses the default Host Filter as and the `weight_hosts` method simply returns a weight of 1 for all hosts. But, from this, you can see calls being routed from Zone to Zone and follow the flow of things. +The easiest way to get started with the `ZoneAwareScheduler` is to use the `nova.scheduler.host_filter.HostFilterScheduler`. This scheduler uses the default Host Filter and the `weight_hosts` method simply returns a weight of 1 for all hosts. But, from this, you can see calls being routed from Zone to Zone and follow the flow of things. The `--scheduler_driver` flag is how you specify the scheduler class name. diff --git a/doc/source/devref/zone.rst b/doc/source/devref/zone.rst index 263560ee2..3dc0f80fd 100644 --- a/doc/source/devref/zone.rst +++ b/doc/source/devref/zone.rst @@ -21,7 +21,7 @@ A Nova deployment is called a Zone. A Zone allows you to partition your deployme The idea behind Zones is, if a particular deployment is not capable of servicing a particular request, the request may be forwarded to (child) Zones for possible processing. Zones may be nested in a tree fashion. -Zones only know about their immediate children, they do not know about their parent Zones and may in fact have more than one parent. Likewise, a Zone's children may themselves have child Zones. +Zones only know about their immediate children, they do not know about their parent Zones and may in fact have more than one parent. Likewise, a Zone's children may themselves have child Zones and, in those cases, the grandchild's internal structure would not be known to the grand-parent. Zones share nothing. They communicate via the public OpenStack API only. No database, queue, user or project definition is shared between Zones. @@ -99,7 +99,7 @@ You can get the `child zone api url`, `nova api key` and `username` from the `no export NOVA_URL="http://192.168.2.120:8774/v1.0/" -This equates to a POST operation to `.../zones/` to add a new zone. No connection attempt to the child zone is done when this command. It only puts an entry in the db at this point. After about 30 seconds the `ZoneManager` in the Scheduler services will attempt to talk to the child zone and get its information. +This equates to a POST operation to `.../zones/` to add a new zone. No connection attempt to the child zone is done with this command. It only puts an entry in the db at this point. After about 30 seconds the `ZoneManager` in the Scheduler services will attempt to talk to the child zone and get its information. 
Getting a list of child Zones ----------------------------- diff --git a/doc/source/image_src/zones_distsched_illustrations.odp b/doc/source/image_src/zones_distsched_illustrations.odp new file mode 100755 index 000000000..8762a183b Binary files /dev/null and b/doc/source/image_src/zones_distsched_illustrations.odp differ -- cgit From 2a90b44ddd797b7e493bbfbe4de80115c96a9ab4 Mon Sep 17 00:00:00 2001 From: Jason Kölker Date: Thu, 16 Jun 2011 11:27:01 -0500 Subject: initial commit of multinic doc --- doc/source/devref/multinic.rst | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) create mode 100644 doc/source/devref/multinic.rst (limited to 'doc/source') diff --git a/doc/source/devref/multinic.rst b/doc/source/devref/multinic.rst new file mode 100644 index 000000000..2a0101078 --- /dev/null +++ b/doc/source/devref/multinic.rst @@ -0,0 +1,26 @@ +MultiNic +======== + +What is it +---------- + +Multinic allows an instance to have more than one vif connected to it. Each vif is represenative of a separate network with its own IP block. + + + +Managers +-------- + +Each of the 3 network managers are designed to run indipendantly of the compute manager. They expose a common API for the compute manager to call to determine and configure the network(s) for an instance. Direct calls to either the network api or especially the DB should be avoided by the virt layers. + +Flat Examples +------------- + + + + +FlatDHCP Examples +----------------- + +VLAN Examples +------------- -- cgit From 9010195558be896bdf536003e00843019a1077d7 Mon Sep 17 00:00:00 2001 From: Jason Kölker Date: Thu, 16 Jun 2011 13:44:38 -0500 Subject: more doc (and by more I mean like 2 or 3 sentances) --- doc/source/devref/multinic.rst | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-) (limited to 'doc/source') diff --git a/doc/source/devref/multinic.rst b/doc/source/devref/multinic.rst index 2a0101078..38e750a2d 100644 --- a/doc/source/devref/multinic.rst +++ b/doc/source/devref/multinic.rst @@ -13,14 +13,21 @@ Managers Each of the 3 network managers are designed to run indipendantly of the compute manager. They expose a common API for the compute manager to call to determine and configure the network(s) for an instance. Direct calls to either the network api or especially the DB should be avoided by the virt layers. -Flat Examples -------------- +Flat Manager +------------ + +The flat manager is most similar to a traditional switched network environment. It assumes that the IP routing, DNS, DHCP (possibly) and bridge creation is handled by something else. That is it makes no attemp to configure any of this. It does keep track of a range of IPs for the instances that are connected to the network to be allocated. +Each instance will get a fixed ip from each network's pool. The guest operating system may be configured to gather this information through an agent or by the hypervisor injecting the files, or it may ignore it completly and come up with only a layer 2 connection. 
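As a rough sketch of the "fixed IP from each network's pool" idea, the snippet below carves one address per network out of two hypothetical pools, one per vif. The network definitions and the in-memory allocation table are invented for this example; the real manager keeps this state in the Nova database::

    import ipaddress

    networks = {
        'net_a': ipaddress.ip_network('10.0.0.0/29'),
        'net_b': ipaddress.ip_network('192.168.1.0/29'),
    }
    allocated = {name: set() for name in networks}

    def allocate_fixed_ips(instance_id, requested_networks):
        # Hand the instance one fixed IP (one vif) on each requested network.
        ips = {}
        for name in requested_networks:
            for ip in networks[name].hosts():
                if ip not in allocated[name]:
                    allocated[name].add(ip)
                    ips[name] = str(ip)
                    break
        return instance_id, ips

    print(allocate_fixed_ips('instance-1', ['net_a', 'net_b']))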
-FlatDHCP Examples ------------------ +FlatDHCP Manager +---------------- -VLAN Examples -------------- + + + + +VLAN Manager +------------ -- cgit From c86bfba6e76f749626b2472ed5e3c6eadf9d5529 Mon Sep 17 00:00:00 2001 From: Jason Kölker Date: Thu, 16 Jun 2011 14:44:17 -0500 Subject: add the actual image --- doc/source/devref/multinic.rst | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) (limited to 'doc/source') diff --git a/doc/source/devref/multinic.rst b/doc/source/devref/multinic.rst index 38e750a2d..141763bd7 100644 --- a/doc/source/devref/multinic.rst +++ b/doc/source/devref/multinic.rst @@ -6,8 +6,6 @@ What is it Multinic allows an instance to have more than one vif connected to it. Each vif is represenative of a separate network with its own IP block. - - Managers -------- @@ -16,7 +14,7 @@ Each of the 3 network managers are designed to run indipendantly of the compute Flat Manager ------------ - + .. image:: /images/multinic_flat.png The flat manager is most similar to a traditional switched network environment. It assumes that the IP routing, DNS, DHCP (possibly) and bridge creation is handled by something else. That is it makes no attemp to configure any of this. It does keep track of a range of IPs for the instances that are connected to the network to be allocated. -- cgit From cedd8e5fe0189477bc0658990e7d8ba519d85d02 Mon Sep 17 00:00:00 2001 From: Jason Kölker Date: Thu, 16 Jun 2011 14:44:57 -0500 Subject: add multinic diagram --- doc/source/image_src/multinic_1.odg | Bin 0 -> 12839 bytes doc/source/images/multinic_flat.png | Bin 0 -> 50924 bytes 2 files changed, 0 insertions(+), 0 deletions(-) create mode 100644 doc/source/image_src/multinic_1.odg create mode 100644 doc/source/images/multinic_flat.png (limited to 'doc/source') diff --git a/doc/source/image_src/multinic_1.odg b/doc/source/image_src/multinic_1.odg new file mode 100644 index 000000000..249c105fc Binary files /dev/null and b/doc/source/image_src/multinic_1.odg differ diff --git a/doc/source/images/multinic_flat.png b/doc/source/images/multinic_flat.png new file mode 100644 index 000000000..16d119686 Binary files /dev/null and b/doc/source/images/multinic_flat.png differ -- cgit From 69f346bd9dd5df3df74d18551429db8f310e8d24 Mon Sep 17 00:00:00 2001 From: Jason Kölker Date: Thu, 16 Jun 2011 14:57:22 -0500 Subject: remove the network-host fromt he flat diagram --- doc/source/image_src/multinic_1.odg | Bin 12839 -> 12363 bytes doc/source/images/multinic_flat.png | Bin 50924 -> 40871 bytes 2 files changed, 0 insertions(+), 0 deletions(-) (limited to 'doc/source') diff --git a/doc/source/image_src/multinic_1.odg b/doc/source/image_src/multinic_1.odg index 249c105fc..bbd76b10e 100644 Binary files a/doc/source/image_src/multinic_1.odg and b/doc/source/image_src/multinic_1.odg differ diff --git a/doc/source/images/multinic_flat.png b/doc/source/images/multinic_flat.png index 16d119686..e055e60e8 100644 Binary files a/doc/source/images/multinic_flat.png and b/doc/source/images/multinic_flat.png differ -- cgit From 829319649af615f2b4c51f8ffa9ce9f1a9e50295 Mon Sep 17 00:00:00 2001 From: Jason Kölker Date: Thu, 16 Jun 2011 15:36:29 -0500 Subject: add in dhcp drawing --- doc/source/devref/multinic.rst | 4 +++- doc/source/image_src/multinic_2.odg | Bin 0 -> 13425 bytes doc/source/images/multinic_dhcp.png | Bin 0 -> 54531 bytes 3 files changed, 3 insertions(+), 1 deletion(-) create mode 100644 doc/source/image_src/multinic_2.odg create mode 100644 doc/source/images/multinic_dhcp.png (limited to 'doc/source') diff --git 
a/doc/source/devref/multinic.rst b/doc/source/devref/multinic.rst index 141763bd7..08617aab5 100644 --- a/doc/source/devref/multinic.rst +++ b/doc/source/devref/multinic.rst @@ -20,10 +20,12 @@ The flat manager is most similar to a traditional switched network environment. Each instance will get a fixed ip from each network's pool. The guest operating system may be configured to gather this information through an agent or by the hypervisor injecting the files, or it may ignore it completly and come up with only a layer 2 connection. +Flat manager requires at least one nova-network process running that will listen to the API queue and respond to queries. It does not need to sit on any of the networks but it does keep track of the ip's it hands out to instances. + FlatDHCP Manager ---------------- - + .. image:: /images/multinic_dhcp.png diff --git a/doc/source/image_src/multinic_2.odg b/doc/source/image_src/multinic_2.odg new file mode 100644 index 000000000..1f1e4251a Binary files /dev/null and b/doc/source/image_src/multinic_2.odg differ diff --git a/doc/source/images/multinic_dhcp.png b/doc/source/images/multinic_dhcp.png new file mode 100644 index 000000000..bce05b595 Binary files /dev/null and b/doc/source/images/multinic_dhcp.png differ -- cgit From 215452cb79e5d006ad57fbe206e886115b864ed0 Mon Sep 17 00:00:00 2001 From: Jason Kölker Date: Fri, 17 Jun 2011 10:46:14 -0500 Subject: add vlan diagram and some text --- doc/source/devref/multinic.rst | 16 ++++++++++------ doc/source/image_src/multinic_3.odg | Bin 0 -> 13598 bytes doc/source/images/multinic_vlan.png | Bin 0 -> 58552 bytes 3 files changed, 10 insertions(+), 6 deletions(-) create mode 100644 doc/source/image_src/multinic_3.odg create mode 100644 doc/source/images/multinic_vlan.png (limited to 'doc/source') diff --git a/doc/source/devref/multinic.rst b/doc/source/devref/multinic.rst index 08617aab5..2d2d1c452 100644 --- a/doc/source/devref/multinic.rst +++ b/doc/source/devref/multinic.rst @@ -4,30 +4,34 @@ MultiNic What is it ---------- -Multinic allows an instance to have more than one vif connected to it. Each vif is represenative of a separate network with its own IP block. +Multinic allows an instance to have more than one vif connected to it. Each vif is representative of a separate network with its own IP block. Managers -------- -Each of the 3 network managers are designed to run indipendantly of the compute manager. They expose a common API for the compute manager to call to determine and configure the network(s) for an instance. Direct calls to either the network api or especially the DB should be avoided by the virt layers. +Each of the network managers are designed to run independently of the compute manager. They expose a common API for the compute manager to call to determine and configure the network(s) for an instance. Direct calls to either the network api or especially the DB should be avoided by the virt layers. Flat Manager ------------ .. image:: /images/multinic_flat.png -The flat manager is most similar to a traditional switched network environment. It assumes that the IP routing, DNS, DHCP (possibly) and bridge creation is handled by something else. That is it makes no attemp to configure any of this. It does keep track of a range of IPs for the instances that are connected to the network to be allocated. +The Flat manager is most similar to a traditional switched network environment. It assumes that the IP routing, DNS, DHCP (possibly) and bridge creation is handled by something else. 
That is it makes no attempt to configure any of this. It does keep track of a range of IPs for the instances that are connected to the network to be allocated. -Each instance will get a fixed ip from each network's pool. The guest operating system may be configured to gather this information through an agent or by the hypervisor injecting the files, or it may ignore it completly and come up with only a layer 2 connection. +Each instance will get a fixed IP from each network's pool. The guest operating system may be configured to gather this information through an agent or by the hypervisor injecting the files, or it may ignore it completely and come up with only a layer 2 connection. -Flat manager requires at least one nova-network process running that will listen to the API queue and respond to queries. It does not need to sit on any of the networks but it does keep track of the ip's it hands out to instances. +Flat manager requires at least one nova-network process running that will listen to the API queue and respond to queries. It does not need to sit on any of the networks but it does keep track of the IPs it hands out to instances. FlatDHCP Manager ---------------- .. image:: /images/multinic_dhcp.png - +FlatDHCP manager builds on the the Flat manager adding dnsmask (DNS and DHCP) and radvd (Router Advertisement) servers on the bridge for that network. The services run on the host that is assigned to that nework. VLAN Manager ------------ + + .. image:: /images/multinic_vlan.png + +The VLAN manager sets up forwarding to/from a cloudpipe instance in addition to providing dnsmask (DNS and DHCP) and radvd (Router Advertisement) services for each network. diff --git a/doc/source/image_src/multinic_3.odg b/doc/source/image_src/multinic_3.odg new file mode 100644 index 000000000..d29e16353 Binary files /dev/null and b/doc/source/image_src/multinic_3.odg differ diff --git a/doc/source/images/multinic_vlan.png b/doc/source/images/multinic_vlan.png new file mode 100644 index 000000000..9b0e4fd63 Binary files /dev/null and b/doc/source/images/multinic_vlan.png differ -- cgit From 16c481fc6c26877f78c75122c316c22cd216e3c3 Mon Sep 17 00:00:00 2001 From: Jason Kölker Date: Fri, 17 Jun 2011 11:49:20 -0500 Subject: more words --- doc/source/devref/multinic.rst | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) (limited to 'doc/source') diff --git a/doc/source/devref/multinic.rst b/doc/source/devref/multinic.rst index 2d2d1c452..b3a82d341 100644 --- a/doc/source/devref/multinic.rst +++ b/doc/source/devref/multinic.rst @@ -11,6 +11,8 @@ Managers Each of the network managers are designed to run independently of the compute manager. They expose a common API for the compute manager to call to determine and configure the network(s) for an instance. Direct calls to either the network api or especially the DB should be avoided by the virt layers. +On startup a manager looks in the networks table for networks it is assigned and configures itself to support that network. Using the periodic task, they will claim new networks that have no host set. Only one network per network-host will be claimed at a time. This allows for psuedo-loadbalancing if there are multiple network-hosts running. + Flat Manager ------------ @@ -27,11 +29,11 @@ FlatDHCP Manager .. image:: /images/multinic_dhcp.png -FlatDHCP manager builds on the the Flat manager adding dnsmask (DNS and DHCP) and radvd (Router Advertisement) servers on the bridge for that network. The services run on the host that is assigned to that nework. 
+FlatDHCP manager builds on the Flat manager, adding dnsmasq (DNS and DHCP) and radvd (Router Advertisement) servers on the bridge for that network. The services run on the host that is assigned to that network. The FlatDHCP manager will create its bridge, as specified when the network was created, on the network-host when the network host starts up or when a new network gets allocated to that host. Compute nodes will also create the bridges as necessary and connect instance VIFs to them. VLAN Manager ------------ .. image:: /images/multinic_vlan.png -The VLAN manager sets up forwarding to/from a cloudpipe instance in addition to providing dnsmask (DNS and DHCP) and radvd (Router Advertisement) services for each network. +The VLAN manager sets up forwarding to/from a cloudpipe instance in addition to providing dnsmasq (DNS and DHCP) and radvd (Router Advertisement) services for each network. The manager will create its bridge, as specified when the network was created, on the network-host when the network host starts up or when a new network gets allocated to that host. Compute nodes will also create the bridges as necessary and connect instance VIFs to them. -- cgit